Brief Reports:


A New Publication Procedure 


For a trial period, the Journal of Consult- 
ing Psychology will accept “brief reports” for 
early publication without expense to the au- 
thor, under certain conditions. The new pro- 
cedure is intended to reduce publication lag 
and permit the circulation of worth-while 
studies of specialized interest or minor im- 
portance which cannot be accepted at present 
because of limitations of space. 


A brief report, strictly limited to one printed 
page, may be accepted if the author agrees to 
make a more extended report available. Up to 
four pages in each issue may be devoted to 
brief reports, published in the order of their 
receipt without regard to the dates of receipt 
of articles accepted for regular publication. It 
is anticipated that most of the brief reports
can appear in the first or second issue to go to 
press following their acceptance. 

The procedure for submitting a brief re- 
port is: 

1. The author sends the editor both the 
brief report and a more complete report of the 
study. The full report should be of sufficient 
length to give a clear account of the back- 
ground, procedure, results, and conclusions of 
the study. 


2. The full report is prepared in the style 
specified by the Publication Manual [1], ex- 
cept that it may be typed with single spacing. 
The editor will send the full report to the 
ADI. 

3. The brief report gives a clear, condensed 
summary of the procedure of the study, and 
as full an account of the results as space permits.
To insure that the brief report does not exceed one
printed page, the author prepares it according to
these specifications:


The text of the brief report, including all 
matter except the title and the author’s lines, 
does not exceed 80 typewritten lines averaging 
42 characters and spaces in length. Set the 
typewriter margins for short lines of 42 char- 
acters, which are 3.5 inches long in elite typing, 
or 4.2 inches long in pica. The manuscript is 
double spaced throughout and, except for the 
short lines, follows the standard style [1]. 

Headings, tables, and references are avoided 
in the brief report. If essential, they must be 
counted in the 80 lines. The brief report in- 
cludes, within its 80 lines, a footnote in this 
style: 

1An extended report of this study may be ob- 
tained without charge from John Doe, 300 Market 
St., Prospect 6, Mass. [giving the author’s full name 
and address] or for a fee from the American Docu- 
mentation Institute. To obtain it from the latter 
source, order Document No. .... from ADI Auxiliary 
Publications Project, Photoduplication Service, Library
of Congress, Washington 25, D. C., remitting
$........ for microfilm or $....... for photocopies. 


4. The author of a brief report prepares at 
least 100 mimeographed copies of the extended 
report. He agrees: (a) to send copies of the 
extended report upon request as long as the 
supply lasts, and (b) not to submit the extended
report for publication to another
printed journal of general circulation. 


Reference 


1. American Psychological Association. Council of 
Editors. Publication manual of the American 
Psychological Association. Psychol. Bull., 1952,
49, 389-449. 
Ranking Bellevue Subtest Scores
for Diagnostic Purposes 


Joseph Jastak 


University of Delaware²


One of the features of the Bellevue scale is 
the diagnostic significance of rank patterns of 
the 11 subtests. Wechsler [4] has found that 
the pattern of sigma deviations of individual 
subtest scores from the mean is to some extent 
related to psychiatric disease entities. His 
schizophrenics, for example, rank high in in- 
formation and vocabulary. They rank low in 
picture completion and similarities. His psy- 
chopaths rate high in object assembly and pic- 
ture arrangement, low in information and 
arithmetic. Wechsler gives sign lists for neu- 
rotic, schizophrenic, organic, psychopathic, and 
mentally defective patients [4, pp. 150-152]. 
Clinical impressions substantially agree with 
the ability patterning reported by Wechsler. 
However, there is no objective way of applying 
his sign lists to individual case records. If an 
objective method were applied, the signs would 
not withstand the rigors of clinical and experi- 
mental testing. 


This study was undertaken to find the possible
reasons why the sign lists fail. We assume,
along with Wechsler, that ability ranks may
provide a sound basis for the study of personality
differences [1]. We also assume that such
differences are measurable, and that their
measures may be clinically useful.


1This study is a by-product of more extensive
research on mental diagnosis undertaken with
financial aid from the Kate Jackson Anthony
Foundation of Lewiston, Maine. The Foundation’s
generous support is here gratefully acknowledged.

The numerous calculations were facilitated by
the use of a Monroe automatic calculator which was
made available to the author through the efforts of
the Rev. H. E. Hammond of Old Swedes Church
and the Christina Community Center, Wilmington,
Delaware.

A modification of this paper was presented at the
annual meeting of the Delaware Psychological
Association on October 13, 1952.


2The psychometric data for this study were
assembled and tabulated while the author was chief
psychologist at the Delaware State Hospital and
Mental Hygiene Clinic, Farnhurst, Delaware.


Problems and Methods 


The measurement of subtest dispersion has
two aspects: (a) the rank orders in which
measured abilities occur and (b) the extent
to which their scores deviate from a particular 
reference point. Our study is confined to the 
first of the two aspects. To measure rank or- 
ders and their patterning, we have found the 
rank-difference correlation method (rho) most 
convenient. The rho method has a number of 
advantages. It avoids dealing with plus and 
minus entries. Its coefficient directly represents 
the multiple relationships between subtests 
without intervening reference points like the 
mean or vocabulary scores. The ranks are in- 
dependent of ability levels. Inferior and supe- 
rior intellects may have similar or identical 
ranks. Ranking is also independent of the de- 
gree of discrepancies between scores which can 
be measured by other means. The rho makes 
the comparison of individual records with each 
other or with those of a group a relatively 
simple matter, and it facilitates cross validation.
Its probable error is fairly well established.
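
The rank-difference coefficient described above can be sketched in a few lines. This is a minimal illustration, assuming the classical formula rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for two rankings of the same n subtests; the function name `rho` and the example rankings are illustrative and are not taken from the study.

```python
def rho(ranks_a, ranks_b):
    """Rank-difference correlation between two rankings of the same items.

    Uses the classical formula 1 - 6*sum(d^2) / (n*(n^2 - 1)), which is
    exact when neither ranking contains ties.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same subtests")
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical rankings of 12 subtests (illustrative only).
same_order = list(range(1, 13))
reverse_order = same_order[::-1]

print(rho(same_order, same_order))     # identical rank orders
print(rho(same_order, reverse_order))  # exactly antithetic rank orders
```

Identical rank orders give a coefficient of plus one and exactly reversed orders give minus one, whatever the level of the underlying scores, which is the sense in which the ranks are independent of ability level.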

Some years ago, we analyzed the Wechsler- 
Bellevue records of several population 
samplings by the rho method. Police and nurse 
applicants, state hospital employees, and neu- 
rotic and psychotic patients of all types were 
sampled in fair-sized groups. The obtained subtest
ranks were all highly correlated. None of the rho’s
between group patterns was lower than .83. Ranking
of subtests failed to differentiate between normal
and abnormal groups or between any two abnormal groups.


Table 1
Correlation Coefficients (rho’s) of Subtest Ranks of Bellevue Weighted Score Means for
Seven Diagnostic Groups Studied by Rapaport

Groups                             N     1     2     3     4     5     6     7
1. Unclassified schizophrenics    37          .76   .76   .77   .66   .68   .68
2. Paranoid schizophrenics        26                .66   .87   .76   .76   .73
3. Paranoid condition and
   simple schizophrenics          22                      .76   .46   .77   .61
4. Preschizophrenics              32                            .78   .92   .83
5. Depressives                    31                                  .70   .67
6. Neurotics                      59                                        .84
7. Police patrols                 54

Then we ranked the Bellevue records of 
studies published by others and got results very 
much like our own. The intercorrelations
tended to be significantly positive with few 
exceptions. The individual coefficients varied 
more with size and homogeneity of the group 
than with the diagnosis. As an example, Table 
1 contains the correlation matrix for the 
weighted score ranks of seven diagnostic groups 
published by Rapaport [3].

The high positive intercorrelations between 
Rapaport’s groups support our findings concerning
pattern similarity in highly different
groups. When Rapaport’s main groups are 
split into their subcomponents, only his simple 
schizophrenics (N=9) and his maladjusted 
patrols (N=9) deviate from the trend of high
positive correlations. A comparison of Rapa- 
port’s seven group patterns with five of our 
own patterns gave 35 rho’s ranging from .66 
to .95, indicating that Bellevue subtest pat- 
terns tend to be similar irrespective of place, 
time, and population sampling. We consider 
the bunching of rho’s at the high positive end 
unusual, since there is no logical reason why 
they should not vary through their entire pos- 
sible range from minus one to plus one. 

If a schizophrenic and a psychopathic patient 
have antithetic ability ranks, as is often re- 
ported and assumed to be the case, then the 
rho coefficient between their score ranks should 
be high and negative. This does not appear to 
be the case. In view of such puzzling results we 
have asked ourselves the following questions. 
Is the idea of differentiating ability ranks
between diagnostic groups a false one? Are our
methods of measuring ability patterns inadequate?
Are external criteria of grouping unreliable?
Should we look for factors obscuring
meaningful ability patterns? Is there perhaps 
something in the norms of the Bellevue scale 
that makes ability ranks so uniform? 

First, let us see to what extent Wechsler’s
own sign lists are correlated. In Table 2
we present the matrix of rho’s for the five di- 
agnostic groups described in Wechsler’s book. 


Table 2
Intercorrelations (rho’s) of Wechsler’s Ranked
Sign Lists for Five Diagnostic Groups*

                 Schizophrenic   Neurotic   Psychopathic   Mentally defective
Organic               .37           .63         .06              .33
Schizophrenic                       .65        -.13              3
Neurotic                                       -.14              2
Psychopathic                                                     .47

* (4, pp. 150-152)

In Table 2 we see that only three out of ten 
entries are significantly positive and that these 
are not as large as ours. Moreover, the psy- 
chopathic pattern is actually negatively related 
to the neurotic and schizophrenic patterns. 
Most of the rho’s are not appreciably different 
from zero which is as it should be. 

To find out why Rapaport’s and our own 
rho’s are so high, we have made five different 
analyses of the test ranks of five groups. We 
have used the Jastak-Bijou Wide Range 
Reading test [2] in addition to the Bellevue 
subtests. This gives us 12 subtests and 12 ranks 
for each group and each method of analysis. 
The five groups were: (a) Delaware State
Hospital employees (attendants, office clerks,
matrons, occupational aides, technicians,
maintenance and recreation workers), (b) patients
diagnosed as psychoneurotic, (c) patients
diagnosed as manic-depressive, (d) patients
diagnosed as schizophrenic, and (e) patients
diagnosed as psychotic because of organic brain
disease. No alcoholics were included in the
group of organic cases. The ranges, means, and
standard deviations of the ages of our groups
are shown in Table 3. There are 80 cases in
each group, 40 males and 40 females, except in 
the organic group which has 40 males and 34 
females, a total of 74 cases. 


Table 3
Means, Standard Deviations, and Ranges of
Ages in Diagnostic Groups

Diagnostic group       N    Range      M       SD
Nonpsychotic
  Male                40    18-68    35.02   13.42
  Female              40    19-58    32.60   11.06
Neurotic
  Male                40    19-65    32.40   11.00
  Female              40    20-64    36.7    12.39
Manic-depressive
  Male                40    18-67    38.82   13.69
  Female              40    20-69    41.25   12.85
Schizophrenic
  Male                40    17-47    30.25    7.26
  Female              40    20-68    33.10    9.29
Organic
  Male                40    19-77    43.32   15.65
  Female              34    18-67    43.94   11.33

Total Number         394





The results of our five groups were analyzed
in the following ways:

1. Weighted score method. Bellevue
weighted scores of each subtest were averaged 
for each group. The obtained averages were 
ranked according to size from highest (rank 
1) to lowest (rank 12). The resulting rank 
orders were then correlated by the rho method. 
The use of weighted scores makes the results 
entirely dependent on Wechsler’s normative 
universe without any known controls. 

2. Quotient method. The weighted score of 
each subtest for every person was multiplied 
by five. A quotient corresponding to this product
was found in Wechsler’s IQ Tables for
verbal or performance tests in the appropriate
age column. The means of the quotients for
each group were calculated, ranked, and correlated
as before. This method depends on
Wechsler’s norms, but introduces age control
since quotients tend to neutralize changes due
to the age factor.

3. Independent sampling method with no
controls. For this method we calculated the
average raw scores on each subtest obtained
by 1,241 individuals. Of these, 329 were state
hospital employees, 169 psychoneurotic patients,
181 nonpsychotic patients, 101 patients
of unknown diagnosis, and 461 psychotic patients
of all kinds. The mean raw scores on
each subtest of our five experimental groups
were compared with the corresponding subtest
means of the total group by sigma deviations.
The latter were ranked in order from largest
plus to largest minus deviation, or from largest
plus to smallest plus deviation, or from smallest
minus to largest minus deviation. The resulting
ranks were correlated with each other.
This method is independent of Wechsler’s test
norms.


4. Independent sampling method with age
control. In this step we divided the 1,241 cases
into four groups according to age: 440 cases
from 17 to 29 years, 330 cases from 30 to 39
years, 273 cases from 40 to 51 years, and 198
cases from 52 to 77 years. We computed the
mean raw scores for each subtest and age
group. Then we found the differences between
these means and those of our five groups separated
by age, and divided them by the sigmas
of the original age groups. The obtained ratios
were averaged, ranked, and correlated with
each other in each of the five groups. This method
is independent of Wechsler’s norms and,
in addition, tends to control for the most
important changes due to aging. It partly
eliminates test differences between persons or
groups of varying age levels and thereby
emphasizes the residual relationships due to
personality differences or psychiatric illness.

5. Independent sampling method with age 
and sex control. We next divided each of the 
four age groups under 4 into two parts ac- 
cording to sex. The mean raw scores for each 
subtest in the resulting eight groups served as 
norms from which sigma deviations were calculated.
These were ranked as before and correlated
by the rho method. This analysis
should neutralize any sex differences which 
might exist in ability ranks. 
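
The independent sampling methods above share one computation: express each group mean as a sigma deviation from a norm mean, rank the deviations, and correlate the ranks of two groups. The sketch below illustrates that sequence under simplifying assumptions: a single set of norms stands in for the separate age and sex norms, only three subtests are used, and all numbers and function names are hypothetical. Tied deviations share an average rank, in which case the classical rho formula is only approximate.

```python
def sigma_deviations(group_means, norm_means, norm_sds):
    """Deviation of each subtest mean from the norm mean, in norm SD units."""
    return [(g - m) / s for g, m, s in zip(group_means, norm_means, norm_sds)]


def rank_descending(values):
    """Rank 1 goes to the largest value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def rho(ranks_a, ranks_b):
    """Rank-difference correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical norms and group means for three subtests (not from the paper).
norm_means = [10.0, 20.0, 30.0]
norm_sds = [2.0, 5.0, 10.0]
group_a_means = [12.0, 20.0, 25.0]
group_b_means = [8.0, 25.0, 40.0]

ranks_a = rank_descending(sigma_deviations(group_a_means, norm_means, norm_sds))
ranks_b = rank_descending(sigma_deviations(group_b_means, norm_means, norm_sds))
print(ranks_a, ranks_b, rho(ranks_a, ranks_b))
```

To mimic the age and sex controls, the same functions would be applied with the norm means and sigmas of the matching age and sex subgroup in place of the single set of norms used here.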


Results 


In reporting results, male and female groups 
will be treated separately through all steps. 
This will permit us to evolve male and female 
patterns if such should be justified. Also, if in 
the course of the analysis, the observed changes 
are parallel for both sexes, our conclusions con- 
cerning the conditions under which pattern 
analysis is most effective will have greater 
weight than they would have if observed in 
one group only. The results of the weighted 
score method are presented in Table 4. 


Table 4
Correlation Coefficients (rho’s) of Subtest Ranks
of Bellevue Weighted Scores for Five
Diagnostic Groups*

                     Neurotic   Manic-depressive   Schizophrenic   Organic
Male
  Employees            .77           .62               .61           .29
  Neurotic                           .80               .89           .65
  Manic-depressive                                     .92           .76
  Schizophrenic                                                      .81
Female
  Employees            .12          -.18               .14           .07
  Neurotic                           .82               .83           .88
  Manic-depressive                                     .89           .91
  Schizophrenic                                                      .85

* For males, N=200, 40 in each group; for females,
N=194, 40 in each group except for 34 in organic group.


The matrix of Table 4 has significantly posi- 
tive rho’s in 15 out of 20 cells. The ability 
ranks of female employees are apparently dif- 
ferent from those of the other female groups. 
The patterns of male employees and organic 
patients are also different. We will consider 
rho’s of .40 and below not significantly dif- 
ferent from zero. The coefficients of neurotic 
and psychotic groups for both sexes are so 
highly positive that their test ranks cannot be 
clearly distinguished from each other. 


The rho’s of the quotient method are given 
in Table 5. The values of Table 5 are general- 
ly somewhat smaller than are those of Table 4. 


Table 5
Correlation Coefficients (rho’s) of Subtest Ranks of
Bellevue Quotients for Five Diagnostic Groups

                     Neurotic   Manic-depressive   Schizophrenic   Organic
Male
  Employees            .72           .56               .34           .07
  Neurotic                           .76               .77           .59
  Manic-depressive                                     .85           .61
  Schizophrenic                                                      .83
Female
  Employees            .25          -.15              -.01           .13
  Neurotic                           .74               .73           .74
  Manic-depressive                                     .94           .74
  Schizophrenic                                                      [illegible]

However, they are still significantly positive
in 14 of 20 cells. The relationship between
male employees and schizophrenics has declined
sufficiently to permit the separation of
the two patterns. In other respects, the quotient
method yields similar rank orders for
neurotics and psychotics of both sexes.

The results of Table 6 were obtained from 
our third analysis, the independent sampling 
method with no age or sex control. The co- 
efficients of Table 6 are, with one exception, 
considerably lower than those of Table 4 or 5. 
The relationships between employee and patient
ranks have become negative in six out of


Table 6
Correlation Coefficients (rho’s) of Subtest Ranks of
Bellevue Raw Scores for Five Diagnostic Groups

                     Neurotic   Manic-depressive   Schizophrenic   Organic
Male
  Employees            4            -.26              -.49          -.48
  Neurotic                           .09               .24           .12
  Manic-depressive                                     [illegible]   .56
  Schizophrenic                                                      .71
Female
  Employees            [illegible]  -.45              -.23          -.31
  Neurotic                           .54               .34           .21
  Manic-depressive                                     .78           .67
  Schizophrenic                                                      .88

eight cells. Male neurotic patterns can now be
clearly distinguished from those of manic-depressives,
schizophrenics, and organics. Female
neurotics are distinguishable from female
schizophrenics and organics. Twelve out of 20
entries represent different patterns (rho’s be- 
low .40). 

The results from our fourth analysis with 
age control are listed in Table 7. Age control 
has left most intercorrelations unchanged in 
comparison with those of Table 6. The most 
striking change is the reduction of rho’s between
manic-depressives and organics in both
male and female groups. The number of distinct
diagnostic ranks has been raised from 12
to 14 as a result of age control.


Table 7
Correlation Coefficients (rho’s) of Subtest Ranks of
Bellevue Raw Scores for Five Diagnostic
Groups with Age Control

                     Neurotic   Manic-depressive   Schizophrenic   Organic
Male
  Employees            .29          -.11              -.41          -.39
  Neurotic                           .30               .38           .09
  Manic-depressive                                     .68           .17
  Schizophrenic                                                      .62
Female
  Employees            .21          -.24              -.24          -.05
  Neurotic                           .51               .16           .28
  Manic-depressive                                     .52           [illegible]
  Schizophrenic                                                      7


Table 8
Correlation Coefficients (rho’s) of Subtest Ranks of
Bellevue Raw Scores for Five Diagnostic
Groups with Age and Sex Control

                     Neurotic   Manic-depressive   Schizophrenic   Organic
Male
  Employees            .11          -.45              -.59          -.68
  Neurotic                           .03               .31          -.43
  Manic-depressive                                     .42           .08
  Schizophrenic                                                      .57
Female
  Employees           -.07          -.83              -.78          -.50
  Neurotic                           .20              -.13          -.06
  Manic-depressive                                     .66           .32
  Schizophrenic                                                      .48

The rank correlations from our independent 
sampling with age and sex control are re- 
corded in Table 8. Sex control has increased 
the number of distinct rank orders from 14 to 
16. Only four cells out of 20 have high positive
correlations. Two of these are of borderline
significance. We can now distinguish well
between all diagnostic groups except
manic-depressives and schizophrenics, and
schizophrenics and organics, in both sex groups.
The high negative correlations between employees and
all psychotic groups are noteworthy.

We investigated the possible reasons for the 
consistently positive correlations between schiz- 
ophrenics and manic-depressives and between 
schizophrenics and organics. Two explanations
seem plausible. We have found that a
large number of incipient schizophrenics are
diagnosed as manic-depressive. When we separated
the younger groups of schizophrenics
from the older patients of the same category
and correlated the resultant two patterns with
that of manic-depressives, we found that the
correlation between old schizophrenics and
manic-depressives became negative (-.24) and
that the correlation between young schizophrenics
and manic-depressives became more
positive (.76). Psychiatrists consulted on this
point agreed that the high relationship was not
due to the similarity between young schizophrenics
and manic-depressives but due to inaccurate
diagnoses. A number of those
diagnosed as manic or depressive turned out to
be clear-cut schizophrenics. A possible explanation
of the positive relationship between schizophrenics
and organics is that many of the
schizophrenic patients were examined at varying
periods following electric shock therapy
which is known to change temporarily the
original functional pattern to an organic one.

In Table 9 we list the ranks of the Bellevue 
subtests and the WRAT Reading test for our 
five groups according to sex. These ranks were 
obtained from the fifth analytic step, the rho’s 
of which appear in Table 8. These ranks can- 
not be obtained by using deviations expressed 
in terms of Wechsler’s weighted scores or quo- 
tients. The data in Table 9 reveal that the 
Table 9
Raw Score Ranks of Wechsler-Bellevue Subtests and Wide Range Reading
Test for Five Diagnostic Groups Obtained from Independent
Sampling of 1,241 Records, with Age and Sex Controlled
(N=40 in each group except for 34 female organics)

                           Employees      Neurotic       Manic-Depressive   Schizophrenic   Organic
Test                       Male  Female   Male  Female   Male  Female       Male  Female    Male  Female
 1. Information             11     8        6     5        4     6            1     3         3     3
 2. Comprehension            3     7        3     1        8     8            6    10         9     6
 3. Digit Span               8    10       12    11       12     4            7     4         2     8.5
 4. Arithmetic               7     9       11     2        2     5            8     7         5     8.5
 5. Similarities             2     5        5     3        6     3            5     6         7    11
 6. Vocabulary              12    11        1     4        1     2            3     2         8     2
 7. Picture Arrangement      9     1        8     7        9    12           10     9         4     4
 8. Picture Completion       6     3        2     8       11     7            4    12         6     7
 9. Block Design             5     4        7    10        5    11            9     5        11    10
10. Object Assembly          4     6       10    12        7     9           11     8        10     5
11. Digit Symbol             1     2        4     6       10    10           12    11        12    12
12. WRAT Reading            10    12        9     9        3     1            2     1         1     1





WRAT Reading test holds a low rank in non- 
psychotics and gradually rises to first rank with 
increasing disorganization or severity of the 
mental condition. A similar progression from 
low to high rank occurs in the information and 
vocabulary tests, though neither of them is as 
clear-cut as that of the reading test. The digit- 
symbol, block design, and similarities tests tend 
to decline in rank with increase in mental dis- 
turbance. The digit-symbol test is as definite 
in its decline as the reading test is in its ascent. 
The remaining six subtests do not show con- 
sistent rises or declines but vary with the diag- 
nostic group. 

The rank patterns of the two sexes within 
each group are positively correlated with each 
other. There are, however, some striking sex 
differences in each diagnostic group in at least 
one subtest. These differences are, in our view, 
quite consistent and may well disturb diagnos- 
tic patterns derived from a sexually mixed 
universe. 

In order to enable psychologists to apply our 
findings to individual clinical cases, we repro- 
duce in Tables 10 and 11 the means and 
standard deviations of the raw scores of each 
subtest for ten groups separated by age and 
sex. Only eight of the ten groups (from 17 
years up) were used in our analysis, but the 
data for boys and girls between the ages of 
13 and 16 are added for the benefit of those 
who might wish to use them. The patterns of 
each diagnostic group are highly correlated at 


different ages. Thus the rank orders of neu- 
rotics between the ages of 20 and 25 are simi- 
lar to the rank orders of neurotics between 
the ages of 50 and 59. 

The facts of Table 10 or 11 may be used in 
the following manner. 1. Administer the 
Wechsler-Bellevue scale and the Jastak-Bijou 
Wide Range Reading test in full to get a raw 
score for each of the 12 subtests. 2. Subtract 
the mean of the subtest in Table 10 or 11 from 
the obtained raw score on that subtest, being 
careful to enter the column appropriate to the 
age and sex of the case. If the patient’s raw 
score is larger than the mean, the difference 
will be positive. If the patient’s raw score is 
smaller than the group mean, the difference 
will be negative. 3. Divide the resultant dif- 
ference for a subtest by the standard deviation 
in Table 10 or 11 for that subtest. 4. Rank 
the 12 deviation ratios from largest positive to 
smallest positive, or largest positive to largest 
negative, or smallest negative to largest nega- 
tive, as the case may require. 5. Correlate the 
patient’s rank pattern with each of the five 
patterns printed in Table 9, using the ranks 
corresponding to the sex of your client. 


Though our groups are small and lack cross 
validation, the largest positive coefficient will 
frequently place the patient in the appropriate 
diagnostic group. At times, all five coefficients 
will be so close to zero that none will fit the 
reported patterns. The following cautions 
should be observed in the interpretation. A
high correlation with the nonpsychotic order
does not necessarily mean normalcy. Our rank
patterns do not take into consideration the degree
of displacements between test scores
which are very important in judging the disturbance
of the individual. Electric shock and
other treatments may radically change ability
patterns. The psychiatric diagnosis has a large
margin of error, especially at the younger
age levels. There are many other ability patterns
which are not included in this study.


Table 10
Means and Standard Deviations of Subtest Raw Scores on the Wechsler-Bellevue
Scale and on the Jastak-Bijou Wide Range Reading Test
for Patients of Ages 13 through 39

Ages 13-16 yrs. Ages 17-29 yrs. Ages 30-39 yrs.
Male Female Male Female Male Female
(N=110) (N=102) (N=275) (N=165) (N=192) (N=138)

Test Mean SD Mean SD Mean SD Mean SD Mean SD Mean SD
1. Information 8.40 4.21 6.97 4.28 13.06 4.89 10.58 4.52 13.04 5.28 10.83 9
2. Comprehension 7.56 2.91 7.08 3.40 1.99 4.58 8.59 3.43 10.2 3.71 8.54 3.51
3. Digit Span 1.32 1.88 9.17 2.02 10.15 2.47 10.12 2.11 10.03 2.21 1.83 2.26
4. Arithmetic +.87 2.31 +.16 2.2 + 2.95 ' 46 02 2.7¢ +8
5. Similarities 7.06 33 7.35 4.78 9.81 4. 942 l + 4.74 9.70 4.77
6. Vocabulary 13.46 4.84 12.91 5.50 19.84 23 18.72 6.14 21.11 7.93 20.99 7.49
7. Picture Arrangement j 69 8.35 3.98 2 4.09 9.04 4.22 ) 8 4.11
8. Picture Completion 8.96 2.41 744 2.81 9.92 2 8.29 3.02 1.60 2.89 7.97 I
9. Block Design 17.24 7.02 14.10 8.15 19.84 8.57 17. g 17.43 7.42 15.04 I
10. Object Assembly 16.74 3.86 15.17 4.94 17.1 $40 16.09 4.52 15.70 4.78 15.72 4.74
11. Digit Symbol 30.98 8.84 35.60 10.83 35.26 12.06 36.13 12.26 30.02 12 6 144
12. WRAT Reading 2.37 20.89 56.91 19.66 74.39 23.04 75.80 19.10 6.58 23.34 ' +


Each major group, like the neurotic, appears 
to have two or more subpatterns which are di- 
agnostically important. The patterns we list 
may therefore be combinations of several subpatterns.


Table 11
Means and Standard Deviations of Subtest Raw Scores on the Wechsler-Bellevue
Scale and on the Jastak-Bijou Wide Range Reading Test
for Patients of Ages 40 through 82

                              Ages 40-51 yrs.                 Ages 52-82 yrs.
                           Male           Female           Male           Female
                          (N=171)        (N=102)          (N=114)        (N=84)
Test                     Mean    SD     Mean    SD       Mean    SD     Mean    SD
 1. Information         12.32   5.26   10.42   5.03     11.04   7       9.33   4.78
 2. Comprehension        9.79   3.66    8.28   3.51      8.94   76      23     3.76
 3. Digit Span           9.80   2.19    9.68   2.34      9.35   2.22    1.39   2.28
 4. Arithmetic           6.67   2.87    5.14   2.63      66     2.8     4.82   2.42
 5. Similarities         8.35   4.76    8.70   4.42      6.52   4.27    7.43   4.14
 6. Vocabulary          19.89   8.11   19.45   7.08     17.99   7.44   19.65   7.13
 7. Picture Arrangement  6.42   3.67    5.74   3.51      09     32      4.24   3.19
 8. Picture Completion   9.06   3.02    7.17   2.67      7.68   3.28    6.7    2.6
 9. Block Design        14.87   7.19   12.85   7.33     10.73   7.12    1      6.59
10. Object Assembly     15.09   4.78   13.67   5.12     12.84   25     12.79   5.18
11. Digit Symbol        25.15  11.81   24.35  12.30     18.14   9.45   19.40  10.55
12. WRAT Reading        71.02  25.20   77.03  22.03     68.57  24.90   73.12  26.35


Summary and Conclusions


We are now ready to answer the questions
asked in the beginning of this paper. Our results
generally confirm the correctness of the
assumption that ability patterns and psychiatric
diagnoses are positively related. The methods
of measuring psychometric patterns are fairly 
adequate, though improvements will further 
crystallize the value of such measures and 
stimulate research that is likely to objectify the 
diagnosis. The psychiatric diagnosis as an ex- 
ternal criterion of comparison has a large mar- 
gin of error, but it can provide us with trends 
which are usable for normative purposes if 
cross validations are made to establish and ex- 
plain the size and nature of the error. Two 
variables which interfere with ability patterning
in diagnosis are age and sex. These variables
should be controlled whenever rank analysis
of subtests is undertaken. The most
serious obstacles to the derivation of diagnostic 
ability ranks are distortions inherent in the 
norms of the Bellevue scale. Constant errors 
seem to push up some subtest scores and pull 
down others, and thereby obscure differences 
due to mental characteristics of the group. The 
number and nature of such errors can only be 
guessed at. Changes in difficulty levels since 
standardization, changes in materials, in di- 
rections of administration and scoring, biased 
population samplings in regard to personality
traits, clerical errors, and varying proportions 
of sex and age samplings contributing to norms 
may be associated with the observed artifacts. 
With respect to Wechsler’s sign lists the 
following conclusions seem warranted: 


1. The diagnostic signs hold up only when
raw score means are used for purposes of com- 
parison. Conversion of raw scores into 
weighted scores or quotients distorts subtest 
patterns. 

2. His sign lists fit our female patterns 
more closely than they do our male patterns 
for each of the comparable groups. 

3. Wechsler’s neurotic, schizophrenic, and 
organic patients are inadequately differentiated 
from each other either in the psychiatric diagnosis
or in the test results, or in both.

4. It is important that the patterns pub- 
lished in this paper be checked carefully on sev- 
eral large and independent groups of each di- 
agnostic category to test both the reliability and 
validity of the ranks. 


Received March 2, 1953. 
Interest in pathological performance on the
Wechsler-Bellevue has been so great that, ex- 
cept for a few studies [2, 8], research with 
normal superior adults has been neglected. Sex 
differences have also received little attention, 
except for the item analysis by Jastak [6] and 
a recent note by Strange and Palmer [11 |. It 
is the purpose of the present study to investi 
gate with young superior adults a number of 
problems with These 
are sex differences as related to deterioration, 


clinical implications. 
clinical 
“signs,” and scatter, as well as item difficulty 
and the problem of ceiling. 


verbal-performance differentials, 


Subjects and Procedure 


The subjects were 85 males and 68 females, 
volunteers for testing, with no known pathology.
All cases in the author’s files with a full scale IQ
of 120 or better were used. According to Wechsler
[12], they may be categorized as superior or very
superior, constituting the topmost 8.9% of the
standardization population. Age, so important in
W-B scatter [5], was controlled so that no subject
was younger than 15 or older than 29. Data on age
were: males, range 18-29, mean 23.5; females, range
15-29, mean 22.3. The majority were college
students, data on years of education being as
follows: males, range 12-19, mean 14.4; females,
range 10-18, mean 13.9. All tests were scored
according to directions by Wechsler [12] or the
supplementary manual by Kitzinger and Blumberg [7].
In order to test sex differences, it is important
that full scale IQ’s be reasonably equated. Table 1
presents data on IQ distributions, showing that
matching was quite close. Full scale IQ statistics
are: males, mean 127.5, SD 5.2; females, mean 126.8,
SD 5.1, the t of the statistically insignificant
difference of 0.7 IQ points being 0.91.



Table 1

Distributions of IQ’s

                 Male             Female
IQ Group       N      %         N      %
120-124       28    32.9      [illegible]
125-129       28    32.9       20    29.4
130-134       18    21.2       15    22.1
135-139        9    10.6      [illegible]
140-144        2     2.4        2     2.9
Total         85   100.0       68   100.0

Results 
Sex differences and deterioration. According to
Table 2, there is a highly significant difference
between superior adult males and females in
Deterioration Quotient (DQ). The male mean DQ is
99.6, which is what is to be anticipated for their
age group. The female DQ mean is 95.0, which is the
DQ to be expected for ages 35-39 [12]. Thus our
young very bright females have a DQ which Wechsler
says applies to an age group some 15 years older.
As a matter of fact, 17.6% of these females have a
DQ of 84 or less, which is what Wechsler gives as
figure for ages 55-59, whereas only 5.9% of males
have a DQ which is as low. The t of this difference
of 11.7% is 2.18, significant at the 5% level.


Table 2

Quotients of Superior Adult Males and Females

                      Males           Females
Quotients            M      σ        M      σ     Diff.     t
Deterioration       99.6   10.7     95.0   10.7    4.6    [illegible]
Verbal             126.3    7.0    122.7    6.7    3.6    3.19**
Performance        123.4    6.3    125.7    6.2    2.3    2.15*
Full scale IQ’s    127.5    5.2    126.8    5.1    0.7    0.91

* Significant at the 5% level.
** Significant at the 1% level.
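
The comparison of the two percentages with a DQ of 84 or less (17.6% of 68 females against 5.9% of 85 males) can be checked with a short sketch. The article does not state which standard-error formula was used; the unpooled form below is an assumption, and the function name `proportion_diff_ratio` is illustrative only.

```python
import math


def proportion_diff_ratio(p1, n1, p2, n2):
    """Ratio of a difference between two independent proportions to its
    standard error. The unpooled standard error is an assumption; the
    article does not say which form was used."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


# Females: 17.6% of 68 have a DQ of 84 or less; males: 5.9% of 85.
t_low_dq = proportion_diff_ratio(0.176, 68, 0.059, 85)
print(round(t_low_dq, 2))
```

Under this assumption the ratio comes out near 2.2, close to the reported 2.18; the small gap is consistent with rounding of the published percentages or a slightly different standard-error formula.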


Sex differences and verbal-performance dif- 
ferential. Not only is there an important sex 
difference in DQ, but Table 2 also re- 
veals that there are highly significant sex dif- 
ferences in verbal IQ and performance IQ, 
although full scale IQ’s are almost identical. 
Males are significantly higher in verbal (V), 
females in performance (P). There is also a 
significant difference between V and P within
the sexes, the male VQ being 2.9 points higher
than the male PQ, the t of this difference (using
the r term) being 2.74, significant at the 1%
level. The female PQ is 3.0 points higher than
the female VQ, the t of this difference (using
the r term) being 2.78, also significant at the
1% level. Wechsler states,
“Subjects of superior intelligence generally do 
better on the verbal, and subjects of inferior 
intelligence do better on the performance part 
of the examination. There are also racial 
(group) and cultural differences” [12, p. 
147]. It might be added that, according to 
evidence of this study, there are sex differences 
as well. The superior adult male does better in 
verbal, whereas the superior adult female does 
better in performance. Of interest here is that 
Strange and Palmer [11], reporting on un- 
equated groups near average, also find males 
better in V than P, whereas females are better 
in P than V. 


Ralph D. Norman


Wechsler [12] also states that for subjects with
IQ’s not far from average, a variation of 8 to 10
points between V and P in either direction is within
normal range. Using his upper limit as important, we
find that 24.8% of males, but only 5.8% of females
have an excess of 10 or more points of V over P.
This difference of 19.0% is highly significant, the
t being 3.44, significant at the 1% level. Among
males, 10.6% have an excess of 10 or more points of
P over V, whereas 22.1% of females show this excess;
the t of the difference of 11.5% is 1.88, significant
between 5 and 10% levels.



Sex differences and subtests. Since DQ’s, VQ’s,
and PQ’s differ significantly between sexes, do
outstanding subtest differences account for these
results? Table 3 presents data for the 11 W-B
subtests, demonstrating that males are significantly
superior in Arithmetic, whereas females are
significantly superior in Digit Symbol. The vastly
greater inferiority of females in Arithmetic is what
causes their lower DQ’s, since Arithmetic is a
“don’t hold” test; it also contributes to their
inferiority in VQ. Their superiority in Digit
Symbol, plus slight superiorities in all other
performance tests, results in a significantly higher
PQ. Jastak [5] says that males tend to be better in
Information, Picture Completion, Object Assembly,
Arithmetic, and Digit Span, whereas females tend to
be better in Vocabulary, Digit Symbol,
Comprehension, Picture Arrangement, and Block
Design. According to Table 3, the present data are
in agreement with Jas-

Table 3

Subtest Weighted Scores of Superior Adult Males and Females

                    Males          Females
Subtest            M      σ       M      σ     Diff.     t
Information       13.8   1.4     13.6   1.7     0.1    0.58
Comprehension     13.8   1.8     13.8   2.0     0.0    0.02
Digit Span        12.3   3.0     11.7   2.6     0.6    1.27
Arithmetic        14.2   2.5     11.8   2.8     2.4    5.54**
Similarities      14.9   1.8     14.9   1.5     0.0    0.12
Vocabulary        13.6   1.5     13.9   1.4     0.3    1.30
Pic. Arran.       12.9   2.5     13.4   2.6     0.5    1.22
Pic. Comp.        13.6   1.2     13.9   1.2     0.3    1.63
Block Design      14.3   1.8     14.7   1.7     0.4    1.36
Object Assem.     13.2   1.8     13.3   1.7     0.1    0.33
Digit Symbol      13.0   1.9     13.7   1.8     0.7    2.20*

* Significant at the 5% level.
** Significant at the 1% level.


Table 4

Significant Differences (t’s) between Means of Subtests of Superior Adult Males

                Infor-            Digit                               Pic.     Pic.     Block    Object
                mation   Compr.   Span     Arith.   Simil.   Vocab.   Arran.   Comp.    Design   Assem.
Comprehension   x
Digit Span      4.16**   3.91**
Arithmetic      x        x        4.79**
Similarities    5.27**   4.67**   [?]**    2.47*
Vocabulary      x        x        4.13**   x        6.50**
Pic. Arran.     2.71**   2.64**   x        3.19**   5.56**   2.13*
Pic. Comp.      x        x        3.81**   x        5.98**   x        2.25*
Block Design    2.63**   2.16*    5.50**   x        2.77**   4.89**   4.35**   4.07**
Object Assem.   2.76**   x        2.28*    2.45*    6.35**   [?]      x        [?]      6.15**
Digit Symbol    2.91**   4.31**   x        3.14**   6.68**   2.26*    x        2.44*    [?]      x

x = no significant difference.
[?] = entry illegible.
Numbers in italics indicate that row subtest means are greater than column means.
* Significant at the 5% level.
** Significant at the 1% level.

tak in all cases except Picture Completion, Object
Assembly, and Comprehension.

Sex difference and “hard signs.” Wechsler [12]
states that for total scores beyond the limits
80-110, one should divide the mean subtest score by
4 to define a significant difference from the mean.
This was done, and again the Arithmetic test
strongly differentiated between the sexes. The data
for significant deviations are as follows: positive,
M 8, F 1; neither positive nor negative, M 69, F 45;
negative, M 8, F 22. The χ² of these differences is
15.34, significant at the 1% level. Females
apparently have a larger number of negative, males
of positive significant deviations.

Sex differences and intertest scatter. Examination
of Table 3 indicates that there is considerable
variability within sexes in terms of subtest means.
Since the assumption in clinical usage of the W-B is
that in normal adults there should be comparatively
little variation among subtest scores, it is
important to note whether high intelligence and sex
make appreciable differences in subtest means.
Accordingly, significance tests were run (using the
r terms) between subtest means for each sex
separately. Table 4 yields results for superior
adult males, and Table 5 for females. These tables
are noteworthy for the very high number (far beyond
chance) of statistically significant differences
which are obtained between certain subtests and
others.
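
The χ² reported earlier in this section for the Arithmetic “hard sign” deviations can be recomputed from the published counts. The sketch below is an ordinary Pearson chi-square on the 2 x 3 table of sex by direction of deviation; the function name `chi_square` is illustrative, and the 1% critical value for 2 degrees of freedom (9.21) is taken from standard tables.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Columns: positive, neither, negative deviations in Arithmetic.
hard_signs = [
    [8, 69, 8],   # males (N = 85)
    [1, 45, 22],  # females (N = 68)
]
chi2 = chi_square(hard_signs)
df = (len(hard_signs) - 1) * (len(hard_signs[0]) - 1)
CRITICAL_1PCT_DF2 = 9.21  # tabled 1% point of chi-square with 2 df
print(round(chi2, 2), df, chi2 > CRITICAL_1PCT_DF2)
```

The statistic agrees with the reported 15.34 to within rounding and lies well beyond the 1% point.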


Typical superior adult male scatter would indicate
not only V greater than P, as noted above, but
Table 5

Significant Differences (t’s) between Means of Subtests of Superior Adult Females

                Infor-            Digit                               Pic.     Pic.     Block    Object
                mation   Compr.   Span     Arith.   Simil.   Vocab.   Arran.   Comp.    Design   Assem.
Comprehension   x
Digit Span      5.33**   5.16**
Arithmetic      4.68**   4.83**   x
Similarities    5.75**   3.56**   8.17**   8.50**
Vocabulary      x        x        6.32**   6.25**   5.14**
Pic. Arran.     x        x        3.73**   3.51**   3.97**   x
Pic. Comp.      x        x        6.67**   5.46**   4.42**   x        x
Block Design    3.96**   2.86**   8.32**   7.96**   x        2.99**   3.14**   3.06**
Object Assem.   x        x        3.99**   3.99**   6.49**   2.59**   x        2.68**   4.96**
Digit Symbol    x        x        6.04**   5.00**   3.91**   x        x        x        3.50**   x

x = no significant difference.
Numbers in italics indicate that row subtest means are greater than column means.
* Significant at the 5% level.
** Significant at the 1% level.







also the following (Table 4): Digit Span signifi- 
cantly lower than 8 other subtests (all except Pic- 
ture Arrangement and Digit Symbol) ; Similarities 
significantly higher than all 10 other subtests; Pic- 
ture Arrangement significantly lower than 7 others 
(all except Digit Span, Object Assembly, and Digit 
Symbol) ; Block Design significantly higher than 8 
others (all except Arithmetic and Similarities) ; and 
Digit Symbol significantly lower than 7 others (all 
except Digit Span, Picture Arrangement, and Ob- 
ject Assembly). These results are comparable with 
studies by Estes [2] and Merrill and Heathers [8] 
on other college males. Using deviations from Vo-
cabulary, Estes found negative deviations in Digit
Span, Picture Arrangement, and Digit Symbol, posi-
tive deviation in Similarities, and zero deviation in
Block Design. With the same scatter measure, Mer-
rill and Heathers found negative deviations in Digit
Span and Digit Symbol, positive deviations in Simi- 
larities and Block Design, and practically zero de- 
viation in Picture Arrangement. 

Typical superior adult female scatter is even 
more pronounced. P is not only greater than V, but 
the following are also shown (Table 5): Digit 
Span significantly lower than all other subtests ex- 
cept Arithmetic; Arithmetic significantly lower 
than all others except Digit Span; Similarities sig- 
nificantly higher than all others except Block De- 














Table 6

Significant Sex Differences in Items for Superior Adults

                                     % Success                                      Agrees with
Test            Item                M       F     Diff. (%)     t      Superiority  Jastak [6]
Information     15. Population     81.1    53.0     28.1      4.77**       M        Yes
                16. Wash. birth.   57.6    75.0     17.4      2.28*        F        Yes
                18. Egypt          95.3    85.3     10.0      2.00*        M        Not found
Arithmetic      Problem 6          98.8    86.8     12.0      2.82**       M        Not found
                Problem 8          89.4    75.0     14.4      2.29*        M        Yes
                Problem 9                                                  M        Yes
                  (0 sols.)        29.4    51.5     22.1      2.80**
                  (2+ sols.)       55.3    28.0     27.3      3.54**
                  (3 sols.)        20.0     8.8     11.2      1.99*
                Problem 10                                                 M        Yes
                  (0 sols.)        29.4    58.8     29.4      3.77**
                  (2+ sols.)       50.6    28.0     22.6      3.02**
                  (3 sols.)        23.5     5.9     17.6      3.20**
Similarities    10. Poem-Statue
                  (2 sols.)        61.2    78.0     16.8      2.28*        F        Yes
Vocabulary      23. Vesper         73.1    87.3     14.2      2.12*        F        Yes
                28. Ballast        84.6    60.3     24.3      3.28**       M        Yes
                30. Spangle        59.0    87.3     28.3      3.99**       F        Yes
Pic. Arran.     4. Flirt                                                   F        Yes
                  (3 sols.)        71.7    88.3     16.6      2.61**
                  (JANET)          36.5    54.4     17.9      2.25*
                  (AJNET)          22.3    10.3     12.0      2.08*
Pic. Comp.      11. Image          81.1    92.7     11.6      2.18*        F        Not found
                13. Thread         81.1    63.3     17.8      2.45*        M        Yes
                14. Brow           50.6    85.3     34.7      4.96**       F        Yes
Block Design    Item 2
                  (6 sols.)         7.1    19.1     12.0      2.17*        F        Yes
                Item 4
                  (4+ sols.)       74.1    91.2     17.1      2.22**       F        Yes
Object Assem.   Profile
                  (10 sols.)       14.1    38.3     24.2      3.45**       F        Not found

* Significant at the 5% level.
** Significant at the 1% level.


sign; and Block Design significantly higher than
all others, except Similarities. Taking both sexes 
together, Similarities and Block Design are sig- 
nificantly high while Digit Span is significantly 
low. 


Sex differences in item success. Jastak [6] 
suggests that item analyses for sex differences 
be made by others either to confirm or to deny 
his results. Table 6 gives data on significant 
superior adult sex differences in various scale 
items. Also indicated is whether results agree 
generally with Jastak (see footnote 10).


Again, there is strong male superiority in Arith- 
metic items, especially problems 9 and 10. For the 
scale as a whole, 10 items favor females, 8 favor 
males. This difference is not as overwhelming as 
that found by Jastak (32 favoring females, 18 
males). Generally though, where we have found 
significant differences, we agree with him. We find 
results he did not, in that more males than females 
succeed with “Egypt” in Information and with 
problem 6 in Arithmetic, while more females suc- 
ceed with “Image” in Picture Completion, and 
“Profile” in Object Assembly. An interesting dif- 
ference not mentioned by Jastak is that a signifi- 
cantly greater percentage of females give JANET 
solutions to “Flirt” in Picture Arrangement, where- 
as a significantly greater percentage of males give 
AJNET solutions. Males are penalized here since 
the former solution carries 3 points of credit, the 
latter 2 points. 

A word might be said about the Digit Span test. 
Jastak [6] indicates that this test is loaded with fe- 
male superiority. He finds, e.g., that females are 
significantly better in Forward 7, and in Backward 
4, 5, 6, and 7. We find no such differences, and if 
anything, slight but consistent superiority for males. 

Jastak states, too, that females are quicker in 
executing Block Designs. We find this to be true. 
The mean number of points earned by females is 
greater in 5 of the 7 problems (all except problems 
3 and 6) and they are superior in the types of 
solutions for problems 2 and 4, as given in Table 
6. 


Order of difficulty of items. Order of diffi- 
culty is, of course, not directly comparable be- 


10. Our data are not directly comparable with Jas-
tak’s for several reasons: (a) we have a much
smaller N (153 vs. 1,176); (b) we used only 120
IQ or better, whereas Jastak used a much wider
IQ range; (c) we made a number of statistical tests
not performed by Jastak, who defined success with
an item as any success, whereas we tested for dif-
ferences at different levels of success; (d) Jastak
gives no indication of having matched his IQ’s as
carefully as we did. For example, 26% of his fe-
males were nurses, who might be presumed to be of
above average intelligence.


tween our superior adults and either Wechs-
ler’s [12] or Jastak’s [6] group. However, in- 
terest attaches to which items cause particular 
difficulty in a highly intelligent group, since 
less intelligent groups would generally also ex- 
perience such difficulty. Over-all success was 
calculated by averaging male and female suc- 
cesses with each item. 


In the Information test, the following items are
relatively harder for our superior subjects than
their regular positions indicate: “Pints” (our rank
order 13, regular rank order 5); “Height” (ours 16,
regular 9); “Paris” (ours 17, regular 12); and
“Pole” (ours 22, regular 17). The following are
relatively easier: “Brazil” (ours 5.5, regular 11);
“Hamlet” (ours 5.5, regular 14); “H. Finn” (ours
12, regular 19); and “Vatican” (ours 15, regular
20). The last 5 items in the test are apparently
approximately correctly placed. Jastak makes the
“Pints” question number 4 in order, but we suspect
it is more difficult than this. He leaves “Height”
and “Hamlet” unchanged, moves “Paris” and “Pole”
down and “Brazil” and “H. Finn” and “Vatican” up.
The strong influence of schooling is shown in our
group in that “H. Finn” and “Hamlet” are easier
than “Pints” or “Height.” Our ranking correlates
.855 with Jastak, .834 with Wechsler.

In Comprehension, scoring 0, 1, or 2 credits, and
taking mean points, we find “Shoes” misplaced for
our superior adults, seemingly being difficult (our
rank 10, regular 5). Intelligent subjects find it
hard to give 3 reasons in order to earn 2 points on
this item. If Wechsler persists in this type of
scoring he should move the item farther down. Eglash
[1] also mentions this problem with “Shoes.” The
“Deaf” item is relatively easy for our group (our
rank 5, regular 10), which may be accounted for by
superior education, the “learning” criterion for 2
points being quickly achieved. We do not entirely
agree with Jastak when he says that Comprehension
has a dearth of items at upper levels of difficulty.
His method of scoring success (i.e., any success)
indicates this; our adherence to point values does
not, since “Shoes,” “Laws,” and “Marriage” are
closer to a mean of 1 than 2 points for our
subjects. The ρ’s are .479 with Jastak, .645 with
Wechsler.

Using mean point values again, we find that in
Similarities the “Air-Water” and “Wood-Alcohol”
items are relatively harder for superior adults;
“Poem-Statue” is relatively easier. The former rank
order 10 and 12 respectively as against regular
positions of 6 and 7; the latter ranks 6, regularly
10. Jastak moves “Wood-Alcohol” down, but leaves
the others unchanged. Our rank order correlates
.727 with Jastak, .671 with Wechsler.

In Arithmetic, practically 100% success is
achieved for the first 5 problems and for problem 7
by our group. Problem 6 is more difficult for
females, and if both sexes are considered, ranks after
problem 7. We thus agree with Jastak on the
greater difficulty of problem 6 over problem 7.
Problems 9 and 10 are of about equivalent diffi-
culty for males, but not for females, the 10th being
harder. The ρ’s are .936 with Jastak, .864 with
Wechsler. 


On the Vocabulary test, the following are rela- 
tively harder for our group: “Fable” (our rank 
order 20.5, regular 15); “Brim” (ours 20.5, reg- 
ular 16); “Armory” (ours 23, regular 14); “Ves- 
per” (ours 29, regular 23); “Ballast” (ours 32, 
regular 28); “Dilatory” (ours 40, regular 36); and 
“Amanuensis” (ours 41, regular 37). The follow- 
ing are relatively easier: “Plural” (ours 8, regu- 
lar 18); “Nitroglycerine” (ours 8, regular 20) ; 
“Microscope” (ours 8, regular 22); “Hara-kiri”
(ours 22, regular 34); “Aseptic” (ours 36, regular
40). We are in almost complete agreement with
Jastak who has moved all our “harder” words
(except “Brim” and “Dilatory”) down, and all our
“easier” words (except “Plural”) up. Certainly
words like “Dilatory” and “Amanuensis,” on which
only 16.4% and 6.0% of our group are successful,
belong down at the very bottom of the list right
before “Traduce.” The ρ’s are .933 with Jastak,
.917 with Wechsler. 


For Picture Arrangement, order of difficulty is 
as Wechsler has them. Practically 100% success is 
achieved on the first 3 items by our superior adults. 
Items 4, 5, and 6 are sufficiently difficult for them, 
since 80.0, 23.5, and 10.6%, respectively, achieve 
perfect solutions. 

The Picture Completion test again reveals more 
agreement with Jastak than Wechsler. “Diamond” 
and “Leg” each rank 10.5 in our group, but 4 and 
5 in Wechsler’s regular order; “Stacks” is also 
relatively harder for superior adults, ranking 13 
as against regular rank 7. Items which are easier
for our group are “Hand” (ours 4, regular 9);
“Water” (ours 2, regular 10); and “Shadow” 
(ours 8, regular 15). Jastak moves all our harder 
items down, and all the easier (except “Water” 
which he leaves unchanged) up. The correlations 
are .785 with Jastak, .517 with Wechsler. 

In Block Design, the 4th item is especially diffi-
cult for our superior adults, but the 5th and 7th
are easier. Rank order of the 7 items in our group
is 1, 3, 4, 6, 2, 7, 5. Jastak agrees substantially,
moving No. 4 down, No. 5 up, but leaving No. 7
unchanged. It would seem that some revamping of
time credits is needed in this test if the present
order is maintained. The ρ’s are .786 with Jastak,
.643 with Wechsler. 
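
The correlation with Wechsler's order for Block Design can be reproduced from the ranks given above, on the assumption that the coefficient is the usual rank-difference (Spearman) correlation and that the list 1, 3, 4, 6, 2, 7, 5 gives the difficulty rank of items 1 through 7 in turn. The function name `spearman_rho` is illustrative.

```python
def spearman_rho(ranks_a, ranks_b):
    """Rank-difference correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)).
    This form is exact only when neither ranking contains ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


wechsler_order = [1, 2, 3, 4, 5, 6, 7]  # regular position of each item
our_ranks = [1, 3, 4, 6, 2, 7, 5]       # difficulty rank in the superior group
rho = spearman_rho(our_ranks, wechsler_order)
print(round(rho, 3))  # 0.643
```

The result matches the reported .643, which supports reading the listed order as per-item difficulty ranks.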

Over all, we find that Jastak [6] is essen- 
tially in agreement in his item rearrangement, 
even though we base conclusions on a highly 
selected group. Our ρ’s with Jastak are higher
than our ρ’s with Wechsler in Information,
Similarities, Arithmetic, Vocabulary, Picture 
Completion, and Block Design, and higher 
with Wechsler only in Comprehension. 

Ceiling of subtests. A unique opportunity is 
afforded when studying superior IQ’s to ex- 
amine discriminative ceilings of subtests. With 
153 subjects, we contrasted lowest and highest 
quarters, or IQ’s of 120—123 and 131—144 
respectively. The lower group N was 44 (to 
include all 11 subjects of 123 IQ), with 24 
males (54.5%) and 20 females (45.5%). The 
upper group N was 38, with 23 males 
(60.5%) and 15 females (39.5%) ; they are 
definitely in Wechsler’s [12] very superior
classification. The two groups thus have comparable
sex percentages. Table 7 presents comparative
subtest ceilings, contrasting both groups.


Table 7

Comparative Subtest Ceilings
Lower vs. Higher Superior Adult IQ’s

                  IQ 120-123      IQ 131-144
                   (N = 44)        (N = 38)
Subtest            M      σ       M      σ     Diff.     t
Information       13.0   1.4     14.5   3.2     1.5    2.75**
Comprehension     13.0   1.6     14.6   1.8     1.5    3.97**
Digit Span        10.6   2.2     13.8   2.2     3.2    6.44**
Arithmetic        12.1   2.1     14.8   2.5     2.7    5.24**
Similarities      14.3   1.6     16.1   1.3     1.8    5.53**
Vocabulary        13.1   1.4     14.8   1.2     1.7    5.71**
Pic. Arran.       12.5   2.4     13.9   2.7     1.4    2.46*
Pic. Comp.        13.4   1.1     14.1   0.9     0.7    3.19**
Block Design      13.5   1.6     15.6   1.4     2.2    6.46**
Object Assem.     12.5   1.8     14.3   1.2     1.8    5.18**
Digit Symbol      12.6   1.8     13.9   1.8     1.3    3.19**

* Significant at the 5% level.
** Significant at the 1% level.


Young Superior Adult Performance on the Wechsler



According to Table 7, Block Design, with a t of
6.46, is the best discriminating test. Wechsler holds
it in high respect, saying that “it conforms to all
statistical criteria for a good test” [12, p. 97], and
reporting that it correlates very well with the scale
as a whole. Digit Span (t of 6.44) ranks next in
discriminatory power, surprisingly, since Wechsler
says, “Memory for digits correlates with general
intelligence up to a certain point, but beyond the
ability to repeat 6 or 7 digits forward, its
correlation becomes negligible” [12, p. 74] and also,
“A good rote memory is of practical value but
correlates very little with higher levels of
intelligence” [12, p. 85]. We cannot agree with him
since the test still is sharply discriminating
between superior and very superior intelligence.
Vocabulary, Similarities, and Arithmetic (t’s of
5.71, 5.53, and 5.24 respectively) hold up as
discriminators quite well, and we agree with
Wechsler who holds them in high esteem. Object
Assembly, like Digit Span, is somewhat surprising
(t of 5.18) since Wechsler feels that formboard-type
tests “at the high levels have very little
discriminative value” [12, p. 97]. Two other tests
Wechsler rates lowly as measures of general
intelligence are Picture Completion and Picture
Arrangement. He says that Picture Completion is
relatively inadequate in discriminating between
higher levels of intelligence (i.e., high average
and superior). We find a comparatively low t of 3.13
for this test, although it should be mentioned that
a low ceiling was built into it by Wechsler himself
since no more than 15 points may be earned on it.
Our lowest t is for Picture Arrangement (2.46),
which Wechsler reports as correlating poorly with
other tests and with the scale as a whole.
Information, a test Wechsler respects, is also
comparatively low (t of 2.75), but it may not be as
poor as it seems in view of the high degree of
education in both our contrasted groups.



Generally, however, although t’s between a
superior and very superior group vary widely,
all tests retain their discriminatory powers
quite well, even those about which Wechsler
has misgivings.
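
The t’s in Table 7 can be approximated from the published means, standard deviations, and group sizes. The sketch below assumes the ordinary ratio of a difference between independent means to its standard error; the article does not spell out the formula, and the function name `mean_diff_ratio` is illustrative. Arithmetic is used as the example.

```python
import math


def mean_diff_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two independent means divided by its standard
    error, sqrt(sd1^2/n1 + sd2^2/n2). Assumed form, not stated in the text."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se


# Arithmetic weighted scores: IQ 131-144 group vs. IQ 120-123 group.
t_arithmetic = mean_diff_ratio(14.8, 2.5, 38, 12.1, 2.1, 44)
print(round(t_arithmetic, 2))
```

For Arithmetic this gives a value within about 0.01 of the reported 5.24. Other rows agree less closely because the tabled means and standard deviations are rounded to one decimal.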


Discussion 

Clinical implications of this study are sever- 
al. Sex differences are great enough to cause 
significant discrepancies in DQ’s and verbal- 
performance differentials, as well as in “hard 
signs.” The clinician examining the older, 
brighter female may not know how much de-
terioration is a function of age or how much
of sex, since in our young, maximally function-
ing females a loss equivalent to 15 years of
age is already apparent in DQ. The V-P dif-
ferential, of which Wechsler makes much as a 
clinically important sign, is also affected by 
sex differences, as is the “hard sign” for Arith- 
metic. Also, typical scatter patterns exist for 
both superior females and males, independent 
of pathology. Other clinicians, such as Gar- 
field [4], Schofield [9], French and Hunt 
[3], and Schnadt [10], have already men- 
tioned the import of factors outside of pathol- 
ogy as instrumental in scatter. Even with our 
highly selected group, there is strong suspicion 
that item difficulty is not quite as Wechsler 
has given it, since all but one of our rank
orders correlate higher with Jastak than with
Wechsler. Jastak [6] feels that accuracy in 
item scaling is essential in order to derive clin- 
ically meaningful measures of intratest scatter- 
ing, as well as to provide for economy, ease, 
reasonable limits, and accuracy of testing. The 
matter of ceiling is important for determina- 
tion of discriminative values of tests between 
higher intellectual levels. As noted above, 
tests like Digit Span and Object Assembly are 
perhaps not as poor high level discriminators 
as Wechsler believes. 


Summary 


Using a group of young adults, all with 120 
IQ or better on the Wechsler-Bellevue, and 
with sexes matched in full scale IQ, it was 
found that: 


1. Significant differences exist between sexes 
in DQ, VQ, and PQ, and within sexes in VQ 
and PQ. Young, superior females appear to be 
more “deteriorated” than their age warrants. 
They are significantly inferior to males in 
Arithmetic, in which they also have signifi- 


cantly more low deviant “hard signs.” 


2. Other sex differences in items appear 
which affect the clinical utility of the test, and 
which generally corroborate previous research. 

3. Highly significant scatter patterns exist
which are functions of both sex and high IQ
level.


4. Study of order of difficulty of items of 
subtests generally supports Jastak’s [6] con- 
tention that there are important discrepancies 
in item gradation. 


5. All subtests discriminate between Superior
(IQ 120-123) and Very Superior (IQ 131-144)
groups, but not to the same degree.

Received February 13, 1953.


Why Wechsler-Bellevue Full-Scale IQ’s Are More
Variable Than Averages of Verbal and
Performance IQ’s


Julian C. Stanley¹


University of Wisconsin 


Many clinicians have been puzzled by discrepancies
similar to those in Table 1.* For persons whose
Verbal and Performance IQ’s on Form I of the
Wechsler-Bellevue Intelligence Scale average 130,
the Full-Scale IQ is greater than 130, and for those
whose V and P IQ’s average 70 the FS IQ is less than
70. That this result for IQ’s differing much from
100 is a mathematical necessity rather than an
indictment of Wechsler’s tables has been sensed by
most psychologists, but so far as the writer knows
no explanation has been published.


Table 1

Wechsler-Bellevue I Data [2] Showing That Full-
Scale IQ’s Are Not Merely Averages of Verbal
and Performance IQ’s

            Verbal    Performance    Average of       Full-Scale
Age           IQ          IQ         V and P IQ’s        IQ
10.0         120         140             130             133
20-24        130         130             130             133
45-49         70          70              70              67

¹ The writer is greatly indebted to Dr. Jerome P.
Doppelt and Mr. Robert F. Schweiker for many
helpful suggestions.

* See, for example, the comments of Sloan and
Schneider [1].


The phenomenon occurs with all kinds of averaged
measurements that are not perfectly intercorrelated.
The teacher who gives several tests during a quarter
and then averages the marks each pupil has received
notices that these final marks include fewer A’s and
F’s than he customarily assigns to the separate
tests. The more tests he administers, the greater
will be this “shrinkage.” Also, the less reliable
and/or related his quizzes are, the fewer high or low
final averages will his students obtain. Obviously,
distributions of means are virtually always less
variable than the single-score distributions from
which they were obtained. Usually the teacher merely
redistributes the average marks so that his
customary percentages of letter grades are given.
This procedure is quite proper in most instances, as
the teacher is probably interested in only the
relative standings of the students within his own
group. There are two cases, however, when the
teacher should be aware of the effect of averaging:


1. Average scores may be interpreted as belonging
to a distribution of scores with a “standard”
standard deviation. For example, IQ scores are used
to identify a person’s performance with respect to
the performance of others. We say that an individual
who attains an IQ of 130 has obtained a score higher
than that secured by approximately 97 or 98 per cent
of the people in the standardizing group. We so
interpret the score because IQ’s are usually scaled
to yield a standard deviation of 15-16¼. If several
IQ’s for the same individual are averaged, the mean
score is not an IQ, since such means have a smaller
standard deviation, depending upon the number of
scores averaged and the intercorrelations of the
tests used. The error will not usually be serious if
the intercorrelations are high.

2. Average scores may be misinterpreted when not
all the individuals to be ranked have the same
number of scores. If there is only one mark for a
certain student and several for all of the other
students, that student’s mark should be regressed
toward the rest of the class’s general mean before
it is included with the averaged marks in the
distribution used to find the final ranking of the
students. This is likely to be a practically
impossible task for the usual teacher, however. Here
again, the seriousness of the error depends on the
number of scores averaged and the intercorrelations
of the tests.
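
A small sketch may make the dependence on the number of tests and their intercorrelations concrete. It assumes the simplest case, in which all n tests share the same standard deviation and the same intercorrelation r, so that the standard deviation of the average is the single-test value times the square root of (1 + (n - 1)r)/n. The function name `sd_of_average` and the sample values (SD 15, r of .6 and .9) are illustrative, not taken from the article.

```python
import math


def sd_of_average(sd, r, n):
    """SD of the mean of n tests, each with standard deviation `sd` and a
    common intercorrelation `r` (equal SDs and equal r are simplifying
    assumptions made for this illustration)."""
    return sd * math.sqrt((1 + (n - 1) * r) / n)


for r in (0.6, 0.9):
    for n in (1, 2, 4, 8):
        print(f"r={r:.1f}  n={n}  SD of average={sd_of_average(15, r, n):.2f}")
```

The printed values shrink as n grows and shrink less when r is high, which is the pattern described for the teacher's averaged marks.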


The implications of this restricted variability
become obvious to anyone who plots on a pro-
file several high or low local-norm standard 
scores and the standard score based upon their 
raw-score sum. If the testee did poorly on all 
tests, his summary standard score will be sub- 
stantially below the average of his separate 
standard scores—perhaps lower than any one 
of them. If he did well, his summary score 
will be higher than the average of the separate 
standard scores. The more tests involved, the 
greater these discrepancies are likely to be. 


Technical Note 


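It can be shown that when each of N individuals
has n different test scores, the variance
of the distribution of the N average scores is
equal to 1/nth of the sum of the average variance
of the n tests and $(n-1)$ times the mean
of the $\tfrac{1}{2}n(n-1)$ different covariances
among the n tests:

$$\sigma_M^2 = \frac{1}{n}\left[\overline{\sigma^2} + (n-1)\,\overline{r_{ij}\sigma_i\sigma_j}\right], \quad i \neq j$$

$$\sigma_M^2 = \frac{\overline{\sigma^2}}{n} + \frac{n-1}{n}\,\overline{r_{ij}\sigma_i\sigma_j} \tag{1}$$

Here $\sigma_M^2$ is the variance of the averages,
$\overline{\sigma^2}$ the average variance of the tests, and
$\overline{r_{ij}\sigma_i\sigma_j}$ the average covariance.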


This formula is the general basis for the
various statements already made. If there are
only two tests (n = 2), $\sigma_M^2$ becomes
$\tfrac{1}{4}(\sigma_1^2 + \sigma_2^2 + 2r_{12}\sigma_1\sigma_2)$.
When dealing with the
Wechsler-Bellevue Intelligence Scales, Forms
I or II, for which the standard deviation of
Verbal or Performance IQ’s at any age level
is 14.83 and the correlation between V and P
IQ’s is about .62, the formula yields a standard
deviation of 0.9(14.83), or 13.35, a “shrinkage”
of 10 per cent. Taking the Full-Scale IQ
of 133 from Table 1, we note that 100 +
0.9(133 - 100) equals 130, the average of V
and P IQ’s shown there.
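The arithmetic of this example can be checked with a short sketch. It assumes Formula (1) in the form: variance of averages = [average variance + (n - 1) × average covariance] / n. The function name `variance_of_means` and the variable names are illustrative and do not come from the original note.

```python
import math


def variance_of_means(mean_variance, mean_covariance, n):
    """Formula (1): variance of individuals' averages over n tests."""
    return (mean_variance + (n - 1) * mean_covariance) / n


sd = 14.83    # SD of Verbal or Performance IQ's, as quoted in the text
r_vp = 0.62   # correlation between V and P IQ's, as quoted in the text

# With equal SDs the average variance is sd**2 and the single
# covariance between the two tests is r_vp * sd * sd.
var_avg = variance_of_means(sd ** 2, r_vp * sd * sd, n=2)
sd_avg = math.sqrt(var_avg)
shrink = sd_avg / sd                  # ratio of the two standard deviations
avg_iq = 100 + shrink * (133 - 100)   # expected V-P average for a Full-Scale IQ of 133

print(round(shrink, 2), round(sd_avg, 2), round(avg_iq, 1))
```

Under these assumptions the ratio comes out at about 0.90, the standard deviation of the averages at about 13.35, and the expected average at about 129.7, which rounds to the 130 cited in the text.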

It can be proved that the correlation between
Full-Scale IQ’s and the average of Verbal and
Performance IQ’s is

$$\frac{(\sigma_v + \sigma_p)\sqrt{1 + r_{vp}}}{\sqrt{2}\,\sqrt{\sigma_v^2 + \sigma_p^2 + 2r_{vp}\sigma_v\sigma_p}},$$

where v and p subscripts refer to Verbal and
Performance weighted scores (not IQ’s) at
the particular age level under consideration.
Probably $\sigma_p$ will be somewhat smaller than $\sigma_v$,
since the intercorrelations of Performance tests
are in general less than those of Verbal tests.
If $\sigma_v$ and $\sigma_p$ were identical or if $r_{vp}$ were 1, the
formula would equal 1, its maximum possible
value.
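The two limiting cases just stated can be verified numerically. The sketch below assumes the correlation has the form (σ_v + σ_p)·sqrt(1 + r_vp) divided by sqrt(2)·sqrt(σ_v² + σ_p² + 2·r_vp·σ_v·σ_p), which is a reconstruction consistent with those limiting cases. The standard deviations 12 and 10 are hypothetical values chosen only for illustration.

```python
import math


def fs_vs_average_r(sd_v, sd_p, r_vp):
    """Correlation between a Full-Scale score (sum of weighted V and P
    scores) and the average of separately standardized V and P scores."""
    numerator = (sd_v + sd_p) * math.sqrt(1 + r_vp)
    denominator = math.sqrt(2) * math.sqrt(
        sd_v ** 2 + sd_p ** 2 + 2 * r_vp * sd_v * sd_p
    )
    return numerator / denominator


equal_sds = fs_vs_average_r(12.0, 12.0, 0.62)      # identical SDs
perfect_r = fs_vs_average_r(12.0, 10.0, 1.0)       # r_vp = 1
unequal_case = fs_vs_average_r(12.0, 10.0, 0.62)   # hypothetical unequal SDs

print(equal_sds, perfect_r, unequal_case)
```

With moderately unequal standard deviations the value stays very close to 1, so the loss in correlation is small even though the averages are no longer on the IQ scale.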
As n, the number of tests, approaches infinity,
$\sigma_M^2$ in Formula (1) approaches $\overline{r_{ij}\sigma_i\sigma_j}$,
the average covariance. If, however, $\overline{r_{ij}\sigma_i\sigma_j}$ is
zero, $\sigma_M^2$ equals $\overline{\sigma^2}/n$, and as n becomes large,
the variance of the means rapidly gets smaller.
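A small numerical illustration of these two limits follows. It is a sketch under assumed values: an average variance of 225 and an average covariance of either 112.5 (average correlation .50) or zero. None of these numbers comes from the original note.

```python
def variance_of_means(mean_variance, mean_covariance, n):
    """Formula (1): variance of individuals' averages over n tests."""
    return (mean_variance + (n - 1) * mean_covariance) / n


MEAN_VARIANCE = 225.0      # hypothetical average test variance
MEAN_COVARIANCE = 112.5    # hypothetical average covariance
test_counts = (1, 2, 10, 1000)

# Correlated tests: the variance settles toward the average covariance.
correlated = [variance_of_means(MEAN_VARIANCE, MEAN_COVARIANCE, n) for n in test_counts]

# Uncorrelated tests: the variance falls as 1/n.
uncorrelated = [variance_of_means(MEAN_VARIANCE, 0.0, n) for n in test_counts]

for n, with_cov, without_cov in zip(test_counts, correlated, uncorrelated):
    print(n, round(with_cov, 3), round(without_cov, 3))
```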


Summary 

Some clinicians have wondered why the
average of Wechsler-Bellevue Verbal and Performance
IQ’s differs in most instances from
the Full-Scale IQ, the former being 10 per
cent nearer 100. Average and FS IQ’s do not
correlate perfectly, so nothing is gained by using
such averages, which are not IQ’s in the usual
sense.

Reduced variability of individuals’ means is 
common to all kinds of measurements. 


Received February 10, 1953. 
The Significance of Artistic Excellence in the
Judgment of Adjustment Inferred from
Human Figure Drawings¹,²


John W. Whitmyre 


Veterans Administration Hospital, Salt Lake City, Utah 


Psychologists who utilize the human figure 
drawing as a projective instrument are often 
asked whether their inferences about the level 
of personal adjustment judged from the 
drawings are not basically inferences concern- 
ing the degree of artistic excellence which the 
drawings possess. A review of the literature 
has revealed no studies aimed directly at de- 
termining the answer to this question, and to 
find information concerning this problem is 
the primary aim of this study. 


While many human figure drawing investigations
have dealt with the drawings detail by detail,
Albee and Hamlin [1] have recognized that
this is not the technique utilized by clinicians in
their daily use or misuse of the human figure
drawing. They state that the aim of their investigation
was to find whether “global or insightful impressions
by clinicians who are experienced with
projective methods have significant reliability and
validity” [1, p. 390]. Their method was to permit
experienced psychologists to rank a series of human
figure drawings according to the level of personal
adjustment which the judges felt was reflected in
each drawing as a totality. They found exceedingly
reliable interjudge judgments could be made, as
well as rather valid ones, with the criterion of adjustment
being arrived at from case history material.
The investigation was expressly designed to
avoid evaluating the drawings piecemeal. It is the
belief that projective test protocols are best approached
from the configurational point of view
which led to the plan of the present study.


Statement of Hypotheses 


¹ Submitted to the Graduate School of the University
of Pittsburgh in partial fulfillment of the
requirements for the degree of Doctor of Philosophy.

² From the Veterans Administration Regional Office,
Pittsburgh, Pa.

In order to evaluate the extent to which the
artistic merit of human figure drawings is related
to the inferences made about the personal
adjustment of the artist, the following hypothesis
was tested.

Hypothesis I. There is no significant rela- 
tionship between the degree of artistic excel- 
lence reflected in drawings of human figures 
and level of personal adjustment as judged by 
psychologists from human figure drawings. 

In order to learn whether psychologists are 
able to judge human figure drawings different- 
ly when evaluating them in terms of artistic 
merit rather than in terms of personal ad- 
justment, a second hypothesis was tested. 

Hypothesis II. Clinical psychologists judge
human figure drawings differently when evaluating
them in terms of level of personal adjustment
reflected therein, than when evaluating
them in terms of degree of artistic excellence.

Since the value of any psychological test
ultimately hinges upon its validity, the following
hypothesis was also tested.

Hypothesis III. Actual level of personal adjustment
is more closely related to level of personal
adjustment as judged from drawings of
human figures than to level of artistic excellence
which the drawings are judged to possess.


Procedure


Fifty drawings of a man and a woman, both 
on the same sheet of paper, were collected from 
Veterans Administration psychological files. 
All psychiatric patients who had executed 
these drawings were between 22 and 33 years 
of age at the time of testing and were white, 
male, veterans of World War II who had 
achieved Wechsler-Bellevue IQ’s of at least
100. All had completed at least eight grades of
schooling. No patients with known central
nervous system damage were used. Fifty human
figure drawings were then collected from
veterans functioning adequately in daily life 
and without diagnosis or history of neuropsy- 
chiatric difficulties, but meeting the same re- 
quirements as the patient sample on the vari- 
ables of age, race, veteran status, and educa- 
tional achievement. In neither group were 
fragmentary drawings utilized. 


Two sets of drawings were then compiled, each
set consisting of 25 psychiatric and 25 nonpsychi- 
atric cases. In each drawing group, the psychiatric 
and nonpsychiatric subjects whose drawings made 
up that drawing group were equated in terms of 
mean age and education. 

After the obliteration of all identifying informa- 
tion, the above-described groups of drawings were 
presented to a group of eight advanced commercial 
art students for ranking according to the artistic 
merits of the drawings. The drawings were also 
presented to two six-man groups of clinical psy- 
chologists holding Ph. D. degrees and having two 
to six years’ experience with the human figure 
drawing as a projective test. These two groups of 
psychologists were equated for experience with the 
human figure drawing. Where one group of psy- 
chologists ranked a set of drawings for artistic 
merit, the other group ranked the same drawings 
for adjustment reflected, and vice versa. All draw- 
ings were thus evaluated for both art and adjust- 
ment by the psychologists. Each psychologist used 
his own concept of what constitutes “adjustment” 
as it is commonly used by clinicians. In all cases, 
counterbalancing for order of presentation was ob- 
served. It was thus possible to procure artists’ art 
evaluations, psychologists’ art evaluations, and psy- 
chologists’ adjustment evaluations of all of the 
drawings in each group. 

In all cases, the rankings which the judges as- 
signed to the drawings were converted to normal- 
curve base-line deviates, or scale values, according 
to the procedure outlined by Guilford [3]. In order 
to find the values appropriate to the testing of the 
hypotheses of the investigation, correlational pro- 
cedures were carried out utilizing the scale values 
assigned the drawings by the artists and psychol- 
ogists. To find the reliability values, each of the 
judging groups was divided into two equal sub- 
groups, the scale values assigned each drawing by 
the subgroups determined, and the subgroup scale 
values correlated. 
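The conversion of rankings to normal-curve deviates can be sketched as follows. This is one common reading of the procedure, not necessarily the exact tables used: each rank is turned into the proportion of items at or below its midpoint, and that proportion is converted to a normal deviate. The function names and the two hypothetical judges are illustrative only.

```python
from statistics import NormalDist, mean


def rank_to_deviate(rank, n_items):
    """Convert a rank (1 = highest) among n_items to a normal deviate."""
    proportion_below = (n_items - rank + 0.5) / n_items
    return NormalDist().inv_cdf(proportion_below)


def scale_values(rankings):
    """Average the deviates over judges.

    rankings[j][d] is the rank judge j gave to drawing d.
    """
    n_items = len(rankings[0])
    return [
        mean(rank_to_deviate(judge[d], n_items) for judge in rankings)
        for d in range(n_items)
    ]


# Two hypothetical judges ranking four drawings.
values = scale_values([[1, 2, 3, 4], [2, 1, 3, 4]])
top_of_fifty = rank_to_deviate(1, 50)
print([round(v, 3) for v in values], round(top_of_fifty, 3))
```

Split-half reliability would then be the correlation between the scale values produced by two halves of a judging group.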


Results 


Reliability. In all judging groups exceed- 
ingly reliable results were found. Table 1 
summarizes the reliability figures. In each case 
the reliability coefficient was found by dividing 
the judging group in half and correlating the 


scale values assigned the drawings by these 
subgroups. All these correlation coefficients 
were then expanded by the Spearman-Brown 
prophecy formula to correct for halving the 
number of judges to compute the correlation. 
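For reference, the Spearman-Brown correction for a doubled number of judges is r_full = 2·r_half / (1 + r_half). The sketch below applies it and also inverts it to show what half-group correlation a corrected value implies. The uncorrected half-group correlations are not reported in the article, so the implied value is only a back-calculation.

```python
def spearman_brown(r_half, factor=2):
    """Predicted reliability when the number of judges is multiplied by factor."""
    return factor * r_half / (1 + (factor - 1) * r_half)


def implied_half_correlation(r_full):
    """Invert the doubling formula: the half-group r that yields r_full."""
    return r_full / (2 - r_full)


stepped_up = spearman_brown(0.80)                # hypothetical half-group r
implied_half = implied_half_correlation(0.943)   # corrected value from Table 1
print(round(stepped_up, 3), round(implied_half, 3))
```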


Table 1

Reliability Coefficients and Standard Errors
for All Judgments Corrected by the
Spearman-Brown Formula

Judges             Drawing group   Judged for    r      SE
Artists            I               Art           .943   .015
Artists            II              Art           .936   .018
Psychologists II   I               Art           .936   .018
Psychologists I    II              Art           .969   .009
Psychologists I    I               Adjustment    .884   .033
Psychologists II   II              Adjustment    .937   .018


By the use of Fisher’s r-into-z transforma- 
tion described by Edwards [2], it was possible 
to determine whether any of these reliability 
values differ significantly. The judgments 
yielded by the psychologists when evaluating 
the drawings in terms of adjustment are 
slightly, yet significantly, less reliable than the 
other two judgments, psychologists’ judgments 
for art and artists’ judgments for art. The dif- 
ference is significant at the .05 level of con- 
fidence. 
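A comparison of this kind can be sketched with the usual r-to-z test for two independent correlations. The exact computation used is not reported, so the sketch rests on assumptions: N = 50 drawings behind each coefficient, and the Spearman-Brown corrected values treated as ordinary correlations. The pair compared (.969 and .884) is chosen from Table 1 for illustration.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent r's."""
    z1 = math.atanh(r1)   # Fisher's r-into-z transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error


# Psychologists' art reliability vs. adjustment reliability (assumed N = 50 each).
z_ratio = fisher_z_difference(0.969, 50, 0.884, 50)
print(round(z_ratio, 2))
```

A ratio beyond 1.96 in absolute value corresponds to significance at the .05 level on a two-tailed test.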

Hypothesis I. This hypothesis was tested by
the correlations between the psychologists’ adjustment
scale values assigned the drawings
and the artists’ art scale values assigned the
drawings. For one set of drawings, the value
was found to be .777 ± .058 and for the other
set, .792 ± .053. These significant, consistent,
and moderately high correlations indicate re- 
jection of the hypothesis which states that no 
significant relationship exists between degree of 
artistic excellence reflected in drawings of hu- 
man figures and level of personal adjustment 
inferred by psychologists from these drawings. 
It is apparent that the degree of artistic ex- 
cellence which a drawing possesses is highly 
related to the degree of personal adjustment 
which the drawing is judged to reflect. 

Hypothesis II. This hypothesis was evalu- 
ated with the correlations between adjustment 
scale values assigned a set of drawings by one 
group of psychologist judges and the art scale 
values assigned the same set of drawings by 
the other, equated group of psychologist
judges. With one set of drawings, the value
found was .885 ± .039. The correlation for
the other set of drawings was .859 ± .037.
These high, significant, and similar coefficients 
of correlation indicate rejection of the hypothe- 
sis which holds that psychologists judge human 
figure drawings differently when evaluating 
them in terms of level of personal adjustment 
than when evaluating them in terms of artistic 
excellence. Rather, they seem to have judged 
the drawings in much the same manner in each 
case. They apparently attended essentially to 
the artistic merits of the drawings. This 
impression is supported by application of par- 
tial correlation procedures to determine the 
correlation between the art judges’ evaluations 
of the drawings for art and the psychologists’ 
evaluations of the drawings for adjustment 
with the effects of the psychologists’ art eval- 
uations “held constant,” or partialed out. In 
the case of each set of drawings, the high cor- 
relations between the psychologists’ evaluations 
of adjustment and the artists’ evaluations of 
art fell to insignificance when the psychologists’ 
art evaluations were partialed out. 
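The first-order partial correlation used here has the standard form r_xy.z = (r_xy - r_xz·r_yz) / sqrt((1 - r_xz²)(1 - r_yz²)). In the sketch below, .777 and .885 are taken from the text, while the correlation between the artists’ and the psychologists’ art evaluations (.85) is a hypothetical value, since that coefficient is not reported.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_artist_art_vs_psych_adj = 0.777    # from the text (one drawing set)
r_psych_adj_vs_psych_art = 0.885     # from the text (same set)
r_artist_art_vs_psych_art = 0.85     # hypothetical, not reported in the article

partial = partial_correlation(
    r_artist_art_vs_psych_adj,
    r_artist_art_vs_psych_art,
    r_psych_adj_vs_psych_art,
)
print(round(partial, 3))
```

With a hypothetical value of this size the partial correlation drops to roughly .10, which illustrates how a high zero-order correlation can fall to insignificance once the psychologists’ art evaluations are held constant.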

Hypothesis III. To test this hypothesis, the
pertinent values were the point-biserial corre- 
lations between the artists’ art judgments and 
the criterion of adjustment, and between the 
psychologists’ adjustment judgments and the 
same criterion. In one case, the correlation be- 
tween the artists’ scale values and the dichoto- 
mous criterion psychiatric patient vs. nonpsy- 
chiatric subject was .095. The psychologists’ 
adjustment values in this case correlated .237 
with the criterion. Neither of these values is 
significantly different from zero. In the case of 
the other set of drawings, the artists’ values 
correlated .101 with the criterion, while the 
psychologists’ adjustment values correlated 
.179 with the criterion. Again, these values are 
not significant. 
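Because a point-biserial coefficient is a product-moment correlation with one dichotomous variable, its departure from zero can be checked with t = r·sqrt(N - 2) / sqrt(1 - r²). The sketch assumes N = 50 drawings per set (25 psychiatric, 25 nonpsychiatric) and a two-tailed .05 critical value of about 2.01 for 48 degrees of freedom. The article does not say which test was used.

```python
import math


def t_for_correlation(r, n):
    """t ratio for testing a correlation against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


N_DRAWINGS = 50          # drawings per set, from the Procedure section
CRITICAL_T_05 = 2.011    # approximate two-tailed .05 value for 48 df

reported = {
    "artists, first set": 0.095,
    "psychologists, first set": 0.237,
    "artists, second set": 0.101,
    "psychologists, second set": 0.179,
}
t_values = {label: t_for_correlation(r, N_DRAWINGS) for label, r in reported.items()}

for label, t in t_values.items():
    print(label, round(t, 2), "significant" if abs(t) > CRITICAL_T_05 else "not significant")
```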

It was possible, utilizing the partial corre- 
lation method, to find values representative of 
the correlation between the psychologists’ ad- 
justment values and the criterion dichotomy 
with the influence of the art values assigned to 
the group by the psychologists partialed 
out. In the first instance, this procedure 
resulted in an increase in the correlation from 
.273 to .372, which is a value significant at the
.01 level of confidence. In the second instance,
the value of .169 was reduced to .002 when
the influence of the psychologists’ art ratings
was partialed out. Hypothesis III must, then,
be rejected, for the correlations indicate that
actual level of personal adjustment is no more
closely related to level of adjustment as judged
from drawings of human figures than to
level of artistic excellence which the drawings
are judged to possess. Also, the influence of the
psychologists’ unconsciously judging according
to artistic merit in one instance raised, and in
the other instance lowered, the correlation between
their adjustment ratings of the drawings
and the criterion of adjustment.


Discussion 


Perhaps the most clear-cut finding of this 
investigation is that there is a significant re- 
lationship between level of personal adjustment 
judged to be reflected in human figure draw- 
ings and the degree of artistic merit which they 
are judged to possess. Without regard to the 
validity or lack of validity of the human figure 
drawing as a clinical instrument, it appears 
that the psychologists who served in this in- 
vestigation ranked the drawings in such a way 
that their ratings corresponded to a marked 
extent with the art evaluations by the artists. 
Furthermore, the high correlations between the 
art and adjustment evaluations made by the 
two groups of equated psychologist judges in- 
dicate that, to a marked extent, they evaluate 
the drawings in much the same manner, 
whether they set out to rate art or adjustment. 

It must be remembered that in each case 
the psychologists who rendered the art values 
for a given group of drawings were not the 
same psychologists who rendered the adjust- 
ment values for those same drawings. Here 
the assumption that these two groups of psy- 
chologist judges are equal in ability to deal 
with the figure drawings is a most basic and 
necessary one. The similarity of the reliability 
and other comparable correlational values sug- 
gests that this assumption was a safe one. 

Furthermore, it appears that neither art nor
adjustment ratings show any appreciable relationship
to the criterion of adjustment, patient
or nonpatient status. If one accepts this
rough criterion of personal adjustment, one
must account for the failure of the ratings of
these “average” clinical psychologists to distinguish
the two differently adjusted groups.
Observation reveals that many of the drawings
by “normal” veterans are fully as inadequate
as many seen in NP hospital files. This is consistent
with many of the findings of Roe [4, p.
336], who has done extensive work involving
projective tests with normals. She has found
that many disturbed projective test protocols
are yielded by apparently well-functioning individuals.
Her interpretation is that while the
basic difficulties as projected on tests may be
the same in the adjusted and maladjusted, the
former group has faced and dealt with the difficulties
while the latter group has not [4, p.
339].

If it is true that underlying personality
structure can be similar in two persons, one
well-adjusted by society’s standards and one
inadequately adjusted, then it appears that this
underlying structure may be the feature of individual
make-up which projective tests tap.
If it is not possible for the average clinical psychologist
to differentiate those who are well-adjusted
from those who are not on the basis
of the human figure drawing, it appears that
use of the instrument to assay level of adjustment
is questionable. Either more adequate and
communicable means of determining level of
adjustment on the basis of drawings is necessary,
or else psychologists should perhaps limit
themselves to drawing inferences about features
other than adjustment level from the
drawings.


Summary and Conclusions 


Human figure drawings were collected from
psychiatric patients and “normal” veterans.
Clinical psychologists ranked the drawings according
to the level of personal adjustment
which they felt was reflected in the drawings.
They also ranked the drawings according to
degree of artistic excellence. Commercial artists
also ranked the drawings for artistic merit.
The results were as follows.


1. Extremely reliable evaluations were 
achieved by all judging groups. 

2. The clinical psychologists utilized the ar- 
tistic merits of the human figure drawings to 
a large extent in their adjustment evaluations 
of them. 

3. The psychologists judged the drawings 
in much the same manner, whether consciously 
judging according to art or adjustment. 

4. Neither art nor adjustment ratings by 
artists or psychologists show any consistently 
significant relationship with the dichotomy 
psychiatric patient vs. nonpsychiatric subject. 
The art values sometimes depress, sometimes 
inflate, the relationship. 

5. As judged by the “average” clinical psychologist
today, human figure drawings executed
by persons of average or above-average
intelligence seem to indicate art achievement
but do not seem to show any consistent relationship
to level of personal adjustment.
Psychotherapists are generally agreed on the
principle that the therapist is personally involved
in the relationship as much as the patient.
According to Jung [4], both are mutually
influenced in the treatment. Sullivan
[9] regards the therapist as a “participant ob- 
server.” In Rogers’ opinion, “.. . to treat an- 
other person as a person is to open oneself to 
change through the influence of the relation- 
ship” [7, p. 175]. From this conception of 
therapeutic interaction, Freud [3] and Obern- 
dorf [6] deduce that even competent thera- 
pists may occasionally be unsuccessful because 
of certain aspects of their physical appearances, 
their personalities, or their blind spots. 

Like the therapist, the administrator of the 
Rorschach test may also tend to elicit certain 
reactions from his subjects. In that event he 
should obtain protocols that are different from 
those given by subjects of examiners with dif- 
ferent characteristics. The present paper is con- 
cerned with this hypothesis of a significant 
over-all difference in protocols obtained by 
various examiners even when they are testing 
similar groups of subjects. This has already 
been demonstrated by Cleveland [2] and San- 
ders [8] for relatively inexperienced examin- 
ers who tested college sophomores under the 
highly controlled and presumably artificial 
conditions of an experiment. The purpose of 
the present study was to test this finding with 
experienced examiners who administered the 
tests under the standard conditions of the clinic 
to equated groups of psychiatric patients. 


¹ From VA Regional Office, Detroit, Michigan.


Subjects

Patients. Cases were selected from among
the most recent patients tested by each of 12
examiners at the Mental Hygiene Clinic of 
the Veterans Administration Regional Office, 
Detroit, Michigan. The assignment of cases 
to the clinical psychologists at the Detroit 
Clinic is determined by the days of the week 
when the patients are first admitted. Since 
there is no policy of scheduling any particular 
type of complaint on a given day, each psy- 
chologist had an equal chance of being assigned 
patients in all the diagnostic categories. A tabu- 
lation of cases of the different examiners re- 
vealed no significant differences in the propor- 
tions of various pathologies. The diagnoses of 
the final sample range from simple neurosis to 
prepsychotic condition. To facilitate the inter- 
pretation of results all subjects who were se- 
lected were male, white, between the ages of 
25 and 32, not disoriented, and their symp- 
toms were functional rather than organic. 
Psychologists. Three standards were used 
in the selection of examiners. The first was 
that they have a minimum of two years of ex- 
perience at the clinic. This helped to insure a 
common theoretical frame of reference and the 
use of similar techniques. They all admin- 
istered the Rorschach test according to Beck’s 
[1] instructions. Second, clinicians were se- 
lected only if they had tested at least 20 cases 
who met the criteria for the choice of subjects. 
Finally, an attempt was made to obtain diver- 
gent personalities on the basis of ratings of the 
psychological staff by two of the writers. 
There were nine males and three females. 


Method 


All records were first coded so that they con- 
tained no obvious clue to the identities of the 
examiners or subjects. One of the authors then
scored all the protocols by Beck’s method. To 
provide for the calculation of reliabilities, a 
randomly selected sample of records was also 
scored by a second person. 

The data were analyzed in two ways. First, 
the raw or absolute scores of the determinants 
R, F, M, FC, CF, C, FY, YF, and V were 
used in analyses of variance to determine whe- 
ther examiners differed significantly from each 
other in the values they produced in their pro- 
tocols. The F’s for each of these determinants 
are contained in Table 1. It was originally 
planned to include Y, but this scoring cate- 
gory was not used because of the paucity of 
responses in this category. 
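The analysis described here is a one-way analysis of variance with examiners as groups. A minimal sketch follows: F is the between-examiner mean square divided by the within-examiner mean square. The three small groups of scores are hypothetical and stand in for the determinant counts in each examiner’s protocols.

```python
def one_way_f(groups):
    """F ratio for a one-way analysis of variance over a list of groups."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)

    ss_between = 0.0
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((score - group_mean) ** 2 for score in group)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores on one determinant for three examiners.
f_ratio = one_way_f([[10, 12, 14], [15, 17, 19], [20, 22, 24]])
print(f_ratio)
```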

Second, in order to control for unequal 
numbers of responses, separate analyses were 
made of the proportions of scoring categories 
in relation to lengths of protocols. Since all 
data in the percentage column are ratios of 
number of responses, there can obviously be no 
entry under R. Because percentages are not 
normally distributed, they were converted to 
arc-sine units. Then, as in the cases of the raw 
scores, analyses were made of the variances 
among examiners of arc-sines of the eight de- 
terminants. The F’s for these analyses are also 
contained in Table 1. 
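The arc-sine conversion can be sketched as below. The article does not state which form was used. The version shown, 2·arcsin(sqrt(p)) in radians, is a common one and is an assumption here, as are the function name and the example counts.

```python
import math


def arcsine_units(count, total_responses):
    """Transform a proportion of responses to arc-sine units (radians)."""
    proportion = count / total_responses
    return 2 * math.asin(math.sqrt(proportion))


# Hypothetical protocol: 15 F responses out of 30 responses in all.
half = arcsine_units(15, 30)
none = arcsine_units(0, 30)
every = arcsine_units(30, 30)
print(round(half, 4), none, round(every, 4))
```

The transformed values, rather than the raw percentages, would then enter the analysis of variance among examiners.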


Results 


The interjudge reliabilities have a mean of
.91 and range between .66 and 1.00. Only
three of the nine are below .90. The low correlation
of .66 is attributable in part to the
small number of YF responses, only 86 in
249 records.


Table 1

Differences in Rorschach Determinants
of Examiners

Scoring symbol   Absolute score F   Percentage score F
R                1.39
F                1.83*              2.00**
M                1.30               .58
FC               1.15               1.42
CF               .60                .06
C                2.06*              2.37
FY               2.56**             1.29
YF               1.40               1.91*
V                1.45               1.65

* Significant at the 5 per cent level or less.
** Significant at the 1 per cent level or less.

Table 1 lists variances among examiners for 
absolute and percentage scores. Of the nine 
absolute scores, variances among examiners for 
three, F, C, and FY, are significant at the 5 per 
cent level. This is approximately seven times 
the number of significant variances that might 
occur by chance. Variances among examiners 
on three of the eight percentage scores, F, C, 
and YF, are significant at the 5 per cent level. 
This is approximately eight times the number 
that would be expected by chance. The results 
thus indicate that there are significant over-all 
differences in the determinants obtained by 
various examiners from comparable groups of 
subjects in an outpatient mental hygiene clinic. 
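The “seven times” and “eight times” figures follow from the number of tests multiplied by the 5 per cent level. A minimal check, with illustrative variable names:

```python
ALPHA = 0.05

def times_chance(n_significant, n_tests, alpha=ALPHA):
    """Ratio of observed significant results to the number expected by chance."""
    expected_by_chance = n_tests * alpha
    return n_significant / expected_by_chance


absolute_ratio = times_chance(3, 9)     # 3 of 9 absolute scores significant
percentage_ratio = times_chance(3, 8)   # 3 of 8 percentage scores significant
print(round(absolute_ratio, 2), round(percentage_ratio, 2))
```

Nine tests at the 5 per cent level give an expectation of 0.45 significant results and eight tests give 0.40, so three significant results are about 6.7 and 7.5 times the chance expectation respectively.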

Some indications of the magnitudes of dif- 
ferences among examiners are afforded by the 
ranges of means of the significant determi- 
nants. In the case of F, there is a considerable 
range of means, from 10.0 to 20.8. If F is 
given its conventional interpretation as an in- 
dicator of control, the possibility arises that 
when testing members of a homogeneous pop- 
ulation some examiners will elicit protocols 
which indicate overconstriction while the re- 
cords obtained by others will indicate weak 
control. In view of the range of means for FY, 
1.0 to 3.4, it is also probable that certain ex- 
aminers in this clinic tend to obtain excessively 
dysphoric records while others with compar- 
able patients may seldom elicit such reactions. 

It is more difficult to assay the significance 
of the results for C and YF, the determinants 
with smaller ranges. Means of C obtained by 
the different examiners vary between zero and 
.55; means of YF range between .05 and .70. 
The absolute magnitudes of these ranges are 
sufficiently small to suggest that the significant 
variances may be a function of the sample of 
patients. In the case of YF this is made more 
probable by the previously mentioned small 
number of responses in that category. Thus, 
despite the significant variances among ex- 
aminers for C and YF, confidence in these re- 
sults must depend upon the repetition of the 
study in the same clinic. 

Assuming the results are stable, the magni- 
tudes of variances should be judged in terms 
of current standards of interpretation. In some 
instances, particularly those involving pathognomonic
signs, the presence of certain determinants
even in small amounts may affect the
interpretation quite radically. This is particu- 
larly true of pure color, so that the narrow 
range of means for C may still point to a sig- 
nificant source of error. 


Discussion 


To what extent can these results be applied 
to the records obtained in different clinics? The 
answer to this question is determined in great 
part by theoretical orientation. In the opin- 
ions of the writers, the subject’s responses indi- 
cate how he organizes the drives aroused in 
the course of his interaction with a particular 
examiner in a specific situation. From this 
premise it may be deduced that the administra- 
tion of the test by different examiners in the 
same setting or the same examiner in different 
settings might produce different protocols. 
This deduction creates considerable difficulties 
for the design of corroborative studies and the 
establishment of norms. However, it also sug- 
gests fruitful hypotheses. 

If this investigation were repeated in an- 
other installation, even identical results could 
not be viewed as validating those of the pres- 
ent study unless conditions were identical. If 
the conditions were not identical, significant 
differences among examiners for more deter- 
minants might reflect more disturbed or less 
experienced examiners. A complete lack of sig- 
nificant variances might result from marked 
similarities in the interpersonal habits of the 
examiners, or from the fact that they were ex- 
perienced. Similarly, the data obtained in con- 
nection with variations in setting or in type of 
patient could not be compared with those of 
the present sample. 

Strictly speaking, then, the present results 
can apply only to a situation homogeneous 
with that of the present clinic. This principle 
should also hold for the establishment of 
norms. A partial recognition of this fact is af- 
forded by the frequently repeated statement 
that Rorschach norms can be used only within 
wide limits. In part, clinicians claim, the ab- 
solute magnitudes of the determinants take on 
different meanings depending upon the con- 
texts or gestalten in which they occur. More- 
over, it is the hope of each examiner that, with 
sufficient experience, he can learn to establish
a group of subjective corrective factors that
can cancel such sources of error as his idio- 
syncratic effect on the patient’s responses or 
unusual variations in the settings or patient's 
motivation. Such a faith, however, is based on 
the assumption, one with which the writers 
are inclined to agree, that once the sources of 
error are eliminated, there are absolute norms 
which are comparable in different situations. 

Rather than stress lack of generality, then, 
it would be more fruitful to inquire into the 
conditions under which results can and cannot 
be generalized. Some light may be thrown on 
this problem by the basic dimensions that Mil- 
ler [5] attributes to the interpersonal relation- 
ship between examiner and subject. The sub- 
ject’s character structure constitutes one group 
of variables. Within this group, the degree of 
pathology of the symptoms may bear upon 
the degree of variance among examiners. For 
example, experience has taught most clinicians 
that, with certain exceptions, the greater the 
pathology, the more inflexible are the responses 
of the patient. In other words, regardless of 
the examiner’s characteristics, the schizophrenic 
or compulsive may tend to order indiscrimi- 
nately their relationships with him by means of 
the same defenses with which they structure 
most of their experiences. Normal subjects, on 
the other hand, should be more sensitive to the 
examiner’s physical and psychological charac- 
teristics. Other things being equal, then, the 
number of significant variances among exam- 
iners should decrease with increasing pathol- 
ogy. This hypothesis should not be difficult to 
investigate. The Rorschach test might be ad- 
ministered by the same group of examiners in 
the same setting to three groups of patients, 
normal, neurotic, and psychotic. Normal pa- 
tients might be selected from medical clinics 
where there could be some insurance that their 
symptoms were not psychogenic. 

A second variable that may bear upon com- 
parability of data is the relative experience of 
the clinicians. If three examining groups, each 
with differing amounts of experience, were 
used in the experiment with the three groups 
of patients, the design could utilize the analy- 
sis of variance technique. Not only would this 
require fewer examiners than other methods, 
but analyses could be made of each of the vari- 
ables separately, of their interaction, and of 
combinations. The value of the latter is illustrated
by the prediction that there are more
significant variances among inexperienced ex- 
aminers testing normal subjects than among 
very experienced clinicians with psychotic pa- 
tients. In the latter case, significant variances 
may tend to disappear, thus enabling the clin- 
icians to share norms. Comparable studies of 
additional dimensions of the interpersonal situ- 
ations should contribute significantly to the 
identification and ultimate measurement of 
the primary sources of error on the Rorschach 
test. 

Summary


This study is concerned with differences in 
Rorschach protocols obtained by various ex- 
aminers from comparable groups of patients in 
a mental hygiene clinic. Factors controlled in 
the matching of groups were sex, race, age, 
orientation, and organicity. 

Of nine absolute scores of determinants, 
three, F, C, and FY, are significant at the 5 
per cent level. Three of the eight percentage 
scores, F, C, and YF, have 5 per cent prob- 
abilities of occurrence, by chance. The signifi- 
cant results are respectively seven and eight 
times the number that would be expected by 
chance. In short, the examiners differ signifi- 
cantly from each other in the determinants 
they elicit from comparable subjects. 
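
The multiples quoted above follow from simple arithmetic. The sketch below, in Python, assumes that each determinant is tested separately at the 5 per cent level, so that the number of significant results expected by chance is the number of tests times .05. The function name is illustrative and is not part of the original report.

```python
def chance_ratio(n_tests, n_significant, alpha=0.05):
    """Observed significant results divided by the number expected by chance."""
    expected_by_chance = n_tests * alpha
    return n_significant / expected_by_chance

# Nine absolute scores, of which three are significant.
absolute_ratio = chance_ratio(9, 3)
# Eight percentage scores, of which three are significant.
percentage_ratio = chance_ratio(8, 3)

print(f"absolute scores:   {absolute_ratio:.2f} times chance expectation")
print(f"percentage scores: {percentage_ratio:.2f} times chance expectation")
```

The two ratios come to roughly 6.7 and 7.5, which the summary rounds to seven and eight.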


A theoretical frame of reference is proposed 
for the interpretation of results and the plan- 
ning of further research. 


Received February a, 1953. 


References 


1. Beck, S. J. Rorschach’s test. Vol. I. Basic pro- 
cesses. New York: Grune & Stratton, 1944. 

2. Cleveland, S. E. The relationship between ex- 
aminer anxiety and subjects’ Rorschach scores. 
Unpublished doctor’s dissertation, Univer. of 
Michigan, 1950. 

3. Freud, S. Recommendations for physicians on 
the psychoanalytic method of treatment. In 
Collected papers. Vol. II. London: Hogarth Press, 
1946. Pp. 323-333. 

4. Jacobi, J. The psychology of Jung. New Ha- 
ven: Yale Univer. Press, 1943. 

5. Miller, D. R. Prediction of behavior by means 
of the Rorschach test. J. abnorm. soc. Psychol., 
1953, 48, 367-375. 

6. Oberndorf, C. P. Unsatisfactory results of psy- 
choanalytic therapy. Psychoanal. Quart., 1950, 
19, 393-407. 

7. Rogers, C. R. Where are we going in psy- 
chology? J. consult. Psychol., 1951, 15, 171-177. 

8. Sanders, R. The relationship between examiner 
hostility and subjects’ Rorschach scores. Un- 
published doctor’s dissertation, Univer. of Michi- 
gan, 1950. 

9. Sullivan, H. S. Conceptions of modern psychi- 
atry. Psychiatry, 1940, 3, 1-117. 






















Journal of Consulting Psychology 


Vol. 17, No. 6, 1953 


Personality Factors in Seizure States with Reference 
to the Rosenzweig Triadic Hypothesis 


Francis M. Canter 
Walter Reed Army Hospital 


There has been a great deal of controversy 
concerning the role of personality factors in 
the production and aggravation of epileptic 
phenomena. Some investigators have tended to 
regard all seizures as expressions of emotional 
conflict, whereas others have differentiated be- 
tween epilepsy as an organic, probably in- 
herited, illness and hysterical seizures in which 
psychopathology is the predominant feature. 
Many workers, of course, who have written 
on this subject make no definite distinctions be- 
tween organically based seizures and hysterical 
attacks. Instead, epilepsy and hysteria are seen 
as opposite poles of a continuum where organic 
or constitutional factors and psychological 
maladjustments operate together in an intricate 
fashion. 

The psychoanalytically oriented writers have 
most consistently emphasized the basic psycho- 
logical disorder which eventuates in epileptic 
symptomatology. While they do not deny that 
organic predispositions may exist, they regard 
the evidence from electroencephalography, 
drug treatment, and surgery as of little conse- 
quence for a real understanding of the attacks. 
Fenichel [4, pp. 265-267], Jelliffe [6], Mit- 
telmann [9], Clark [1], and others have 
pointed to repressed hostility, extreme nar- 
cissism, and the regressive nature of the seizure 
as indicative of its basically psychological ori- 
gin. 


1 The present paper is a condensation of part of a 
doctoral dissertation submitted to the Graduate 
Board of Washington University. The author wishes 
to express his gratitude to the sponsor of the disser- 
tation, Dr. Saul Rosenzweig, for his encouragement 
and assistance. The author also gratefully ack- 
nowledges the many helpful suggestions of Dr. 
Wilse Webb. The research could not have been ac- 
complished without the constant cooperation and in- 
terest of Col. Donald Peterson, MC, USA, and Dr. 
Earl Swartzlander of the Neuropsychiatric Service, 
Fitzsimons Army Hospital. 


Lennox [7] is one of the main protagonists of the
point of view that epilepsy is essentially a neurological
disease and should be clearly differentiated
from the psychogenic seizure episodes, which may
be considered hysterical. Those who hold this viewpoint
do not disregard the role of psychological
stresses in the aggravation or even precipitation of
the convulsive syndrome, but they regard these
stresses as no more important than any other stress,
physical or environmental, with which the organically
weakened organism cannot cope. Consequently,
they believe that epilepsy is principally a pathophysiological
problem, and that in cases where no
brain dysfunction is demonstrable, the term hysteria
may be applied. Both epileptic and “psychogenic”
seizures may occur, of course, in the same
individual, and these cases will present special diagnostic
and treatment problems. However, in a recent
restatement of his position, Lennox [8] has estimated
that 95% of epileptic patients in his experience
had no hysterical seizures, alone or in conjunction
with the epileptic attacks.

There is increasing evidence that any individual
may have convulsive symptoms under the right
conditions. Some investigators, e.g., Gastaut [5],
have recently attempted to discover the relative
“convulsive thresholds” of varying subjects such as
idiopathic epileptics, schizophrenics, and hysterics.
From this standpoint, the designation of “epilepsy”
or “hysteria” becomes an arbitrary one, depending
on the level at which it is concluded that the organic
weakness is a more deciding factor than the
psychological stress which may be involved. Presumably,
an individual with a very low convulsive
threshold would succumb to minor stresses
whether psychological or physiological, whereas an
individual with a high threshold would bear up
under all but quite severe stress. Nevertheless, the
differential distinction between epileptic and psychogenic
seizures continues to be used as a practical
prerequisite for therapy.


The present study is an attempt to outline 
certain personality patterns in two groups of 
seizure patients. A group of patients diagnosed 
as idiopathic epileptic will be compared with a 
group of patients whose seizures are attributed 
to emotional factors and who are considered 
hysterical. It is expected that the patients with 
hysterical seizures will, as a group, show evi- 
dence of greater psychological disturbance than 
the group of epileptic patients. If this hypothe- 
sis is sustained, it will provide significant sup- 
port for the concept that the psychological in- 
volvement need not be as pronounced for pa- 
tients with an organic weakness. 

With respect to the personality variables on 
which the two groups might be expected to 
differ, the Rosenzweig triadic hypothesis was 
considered applicable. This hypothesis postu- 
lates that the availability of repression as a de- 
fense mechanism, the tendency to react to frus- 
tration in an impunitive manner, and relative 
ease of hypnotizability will be found in positive 
association with each other. The hypothesis is 
based on observation of the appearance of 
these factors in hysteria, and on experimental 
studies [11, 15]. 

With the hypothesis as a “guidepost,” it was
postulated that a psychogenic seizure (hysterical)
group would (a) repress to a greater extent,
(b) be hypnotizable to a greater extent,
and (c) react to frustration in an impunitive
manner to a greater extent than a group of
epileptic patients.²

In addition, it was postulated that the psy- 
chogenic seizure group would, on projective 
tests, give more responses indicative of anxious 
and hostile tensions, which might be assumed 
to stem from the dynamic needs and conflicts 
which are denied expression in more overt be- 
havior. 


Selection of Subjects and Procedure 


Subjects for the study were obtained from 
the Neuropsychiatric Service of Fitzsimons 
Army Hospital. All male, military patients ad- 
mitted to the neurological section with primary 
symptoms of seizures were subjected to an 
elaborate and intensive examination with the 
purpose of distinguishing between those pa- 
tients whose seizures result from organic dys- 
function and those whose seizures are symp- 
toms of psychological maladjustment. The ul- 
timate differential diagnosis was formulated 
mainly on the basis of neurological and electroencephalographic
data. An additional criterion
for differential diagnosis was the ability
of the subject to recall, under hypnosis, the
events of the height of the seizure, for which
there is, presumably, true amnesia in the genuinely
epileptic patient. A detailed description
of the procedure leading to this differential
diagnosis has been given by Peterson, Sumner,
and Jones [10].


2 The relationships between the variables of the
triadic hypothesis, as they are manifested in the
seizure groups, will be discussed in a separate
paper.

In each case, the subject was seen first for 
hypnotic sessions by a military psychiatric resident
and then by the author for the psychological
experimentation. Neither the author nor
the hypnotist, at the time of examination,
knew anything of the results of any previous
tests or examinations upon which the diagnosis
might depend, nor the results of the other’s 
examination. The experimental variables, other 
than recall under hypnosis, did not enter into 
the final diagnosis. 

The two experimental groups for this study 
were as follows: 

1. Forty-four patients diagnosed as having 
idiopathic epilepsy. This group will be termed 
the epileptic group. 

2. Forty-five patients diagnosed as neurotic 
or character disorder, whose seizures are con- 
sidered symptoms of psychological maladjust- 
ment. This group will be termed the psycho- 
leptic group. 

Since the subjects were tested as they came, 
without indication at that time of what the 
ultimate diagnosis might be, it was impossible 
to match the groups in advance on such vari- 
ables as intelligence, age, etc. However, the 
two groups as finally constituted do not differ
significantly in these respects. In each
group, the average IQ (Wechsler-Bellevue
Full Scale) is 101, average age is approximately
23, and average education is slightly
over 10 years. On military rank and length of
military service, the groups are also comparable.

The order of administration of the experimental
measures for each subject was as follows:
hypnosis, reaction to frustration (Rosenzweig
P-F Study), repression measure, Rorschach
test, Thematic Apperception Test (11
selected cards), and Wechsler-Bellevue test.


Hypnosis was measured by the number of hypnotic
criteria met by the subject during and after
induction of the trance. These criteria include
drowsiness, anesthesias, catalepsies, amnesias,
hallucinations, and regressive phenomena, these states
in this order presumably indicating a progression
from a relatively light to a relatively deeper hypnosis.
There were 12 criteria in the list. The hypnotist
also made an over-all judgment as to whether
a hypnotic state was induced, and, if so, whether it
was a medium trance or a deep, somnambulistic
state.

To obtain a measurement of reaction to frustration,
the Rosenzweig Picture-Frustration Study was
employed. This is a limited projective technique
consisting of a series of cartoon-like situations, each
of which presents a common frustrating situation.
The subject’s written responses for each situation
are scored as extrapunitive (outward direction of
aggression), intropunitive (inward direction of
aggression), or impunitive (denial or dismissal of
the frustration), in accordance with the scheme of
Rosenzweig [14]. Norms for the test are available.
Scoring was done in accordance with the criteria
and samples provided in the manual.

Repression was measured by means of the technique
employed by Rosenzweig in previous studies
[12, 15]. Sixteen jigsaw puzzles, with an object
pictured on each, were administered to the subject
under the guise of an intelligence test. The subject
was told that he was expected to do well on
all puzzles, and that his score would be evaluated
in terms of other individuals’ scores. Various devices
to increase tension and promote involvement
in the task were used, such as constant reference to
the stop watch, frowning, and criticism of the subject’s
performance. On eight of the puzzles the
subject was permitted to finish, but on the other
eight, he was forced to stop before finishing and
was informed that the time limit was up. The subject
was allowed to finish the first, second, fourth,
sixth, ninth, tenth, twelfth, and fourteenth puzzles
in the series, but was not allowed to complete the
remainder. After the last puzzle, there was a 15-minute
recess, and then the subject was asked to
write down the names of all the puzzles that he
could remember. It was assumed that, under the
foregoing conditions, completed puzzles would be
experienced as “successes” and incompleted puzzles
as “failures.” Repression was considered as shown
if the subject recalled more of his successes than
his failures.
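
The repression measure can be sketched in a few lines of Python. The sketch assumes that recall is recorded as the serial positions (1 to 16) of the puzzles named by the subject, whereas the subjects actually wrote down puzzle names; the function names and the sample recall list are hypothetical. The score itself is the number of successes recalled minus the number of failures recalled, as used in Table 1.

```python
# Serial positions of the puzzles the subject was allowed to finish,
# as listed in the procedure; the other eight were interrupted.
COMPLETED = frozenset({1, 2, 4, 6, 9, 10, 12, 14})
INTERRUPTED = frozenset(range(1, 17)) - COMPLETED

def sf_score(recalled):
    """Number of successes recalled minus number of failures recalled."""
    recalled = set(recalled)
    successes = len(recalled & COMPLETED)
    failures = len(recalled & INTERRUPTED)
    return successes - failures

def recall_pattern(recalled):
    """Classify a subject by which kind of puzzle predominates in recall."""
    score = sf_score(recalled)
    if score > 0:
        return "successes"  # taken as evidence of repression
    if score < 0:
        return "failures"
    return "neither"

# Hypothetical subject: recalls four completed and two interrupted puzzles.
example_recall = [1, 2, 4, 9, 3, 16]
print(sf_score(example_recall), recall_pattern(example_recall))
```

A positive score places the subject among those recalling a preponderance of successes, a negative score among those recalling a preponderance of failures, and a zero score among those recalling a preponderance of neither.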

The Rorschach test and 11 cards of the 1943
edition of the Murray Thematic Apperception
Test were administered to each subject and the
protocols analyzed for content and thematic indication
of anxious and/or hostile preoccupations. The
TAT cards were, in order administered, 2, 14, 12M,
6BM, 18M, 13MF, 5, 18GF, 12B, 9BM, and [illegible]. The
Rorschach protocols were scored for content indications
of anxiety and hostility in the manner described
by Elizur [3]. The TAT stories were
scored in a similar manner by indicating whether
the theme of each story suggested anxiety and/or
hostility, or was “neutral” in content. It should be
pointed out, in this connection, that the total
anxiety-hostility score is considered by the author to
be a more meaningful measure than either the
anxiety score or hostility score separately, since
each score contains elements of the other. The total
score, on the other hand, probably is a relatively
stable estimate of the intensity or generality of
psychological tensions which may be expressed as
anxiety on one occasion and hostility on another.


All tests were scored without knowledge of
the final diagnosis of the subject. The scoring
on the Picture-Frustration, Rorschach, and
TAT was independently checked by a psychology
graduate student familiar with the
scoring used. A check of a sample of approximately
one-third of the cases showed correlations
with the author’s scoring of .94 to .97
for the P-F variables, .95 for the Rorschach
total score, and .90 for the total TAT score.


Results and Discussion 


Table 1

Comparison of Groups on Experimental Variables

                              Epileptic        Psycholeptic
                              (N = 44)         (N = 45)
Experimental variables        Mean     SD      Mean     SD      t

Repression S-F score         -0.068   1.876    1.622   1.911   4.15***
Hypnosis No. criteria         6.14    4.67     4.87    4.18    1.33
Impunitiveness               28.10   10.35    31.25   11.00    1.37
Intropunitiveness            27.21    8.28    30.70    9.43    1.31
Extrapunitiveness            44.02   14.75    37.95   16.91    1.78*
Rorschach Total A-H score     8.16    6.03    11.78   12.08    1.77*
TAT Total A-H score          12.43    4.04    14.82    4.55    2.62**

* Significant at the .10 level.
** Significant at the .05 level.
*** Significant at the .01 level.


The psycholeptic group repressed to a significantly
greater degree than the epileptic
group. In the psycholeptic group, 33 subjects 
recalled a preponderance of successes, 4 sub- 
jects recalled a preponderance of failures, and 
8 recalled a preponderance of neither. In the 
epileptic group, 15 subjects recalled a prepon- 
derance of successes, 21 recalled more failures, 
and 8 recalled a preponderance of neither. 
Comparison of these frequencies results in a 
χ² value of 16.34, which is significant beyond 
the .01 level of confidence. 

In Table 1 the groups are compared on re- 
pression in terms of a score consisting of the 
number of successes recalled minus the number 
of failures recalled. In terms of amount of 
repression as well as mere number of cases of 
occurrence, the psycholeptics repress more 
than the epileptics. Both experimental groups 
were further compared with the results from a 
nonpatient group reported by Rosenzweig 
[12]. The mean success-minus-failure score of 
this group was 0.167 with a standard deviation 
of 1.683. There is no significant difference 
between the nonpatient, “normal” group and 
the epileptic group, whereas the psycholeptic 
group represses more than the nonpatient 
group, the difference being significant at the 
.01 level of confidence. 
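
The t ratios in Table 1 can be approximated from the tabled means, standard deviations, and group sizes. The Python sketch below is not the author's stated computation: it assumes that each standard deviation was computed with N in the denominator, so that the squared standard error of a mean is SD² / (N - 1), and it treats the two groups as independent. The function name is illustrative.

```python
import math

def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t ratio for two independent groups from means, SDs, and Ns.

    Assumption: each SD uses N in its denominator, so the squared
    standard error of each mean is SD**2 / (N - 1).
    """
    se_diff = math.sqrt(sd_a ** 2 / (n_a - 1) + sd_b ** 2 / (n_b - 1))
    return (mean_a - mean_b) / se_diff

# Repression S-F score from Table 1: psycholeptic versus epileptic group.
t_repression = t_from_summary(1.622, 1.911, 45, -0.068, 1.876, 44)
print(f"t = {t_repression:.2f}")
```

Under these assumptions the result is about 4.16, close to the tabled value of 4.15; the small discrepancy may reflect rounding of the tabled statistics or a slightly different formula.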

The epileptic and psycholeptic groups did 
not differ in significant degree with respect to 
hypnosis (Table 1). In terms of the dichoto- 
mization of hypnotized versus nonhypnotized, 
exactly 25 subjects in each group were hypno- 
tized to a medium or somnambulistic level. 

In the interpretation of these results certain 
factors of possible importance should be noted. 
When an epileptic subject was hypnotized, he 
generally met all the criteria up to a given 
level without gaps or tests which he could not 
or would not meet. A large proportion of the 
hypnotizable psycholeptics, however, met the 
criteria in an erratic fashion, meeting some cri- 
teria supposedly indicating deeper hypnosis 
but not meeting criteria indicating lighter 
hypnosis. A recent paper by Ehrenreich [2] 
suggests that this phenomenon may be ascrib- 
able to unconscious resistances to those par- 
ticular activities. If this is the case, it would 
seem that unconscious resistance to hypnosis 
was operating to a much greater extent in the 
psycholeptic group than in the epileptic. In 
support of this possibility, there are some behavioral
and questionnaire data (not reported
here) on the present subjects to suggest that
the psycholeptics were, in general, more resis- 
tive to treatment and examination, possibly 
because of a realization that their seizures 
might be gainful to them and/or because of 
their greater difficulty in adjustment and es- 
tablishment of interpersonal relations. Of all 
the measures, hypnosis was probably the most 
susceptible to interference from situational 
factors, because of the preconceived attitudes 
toward it held by many of the subjects and be- 
cause many apparently feared this seemingly 
mysterious and unconventional procedure. 
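
The contrast between an orderly and an erratic record on the hypnotic criteria can be made concrete with a short Python sketch. It assumes that each subject's record is a list of 12 true/false entries ordered from the lightest to the deepest criterion; the two records shown are hypothetical and are not data from the study.

```python
def is_erratic(criteria_met):
    """Return True if a deeper criterion is met while a lighter one is not.

    criteria_met: booleans ordered from the lightest to the deepest
    hypnotic criterion.
    """
    seen_unmet = False
    for met in criteria_met:
        if not met:
            seen_unmet = True
        elif seen_unmet:
            return True
    return False

# Hypothetical records for the 12 criteria (True = criterion met).
cumulative = [True] * 7 + [False] * 5
erratic = [True, True, False, True, False, False,
           True, False, False, False, False, False]

print(is_erratic(cumulative))
print(is_erratic(erratic))
```

The first record meets every criterion up to a given level without gaps, as was typical of the hypnotized epileptic subjects; the second meets some deeper criteria while missing lighter ones, the pattern described for many of the psycholeptic subjects.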

In Table 1 is also presented a comparison of 
the two groups on the variables of impunitive- 
ness, intropunitiveness, and extrapunitiveness. 
In addition to comparing the groups with each 
other, each group’s scores were compared with 
the Rosenzweig revised norms [13]. None of 
the epileptic mean scores differs significantly 
from the reported norms. However, the psy- 
choleptic group is less extrapunitive than the 
norm (mean: 45; standard deviation: 13.3). 
This difference is significant at the .01 level 
of confidence. Likewise, the psycholeptic 
group is more impunitive than the norm 
(mean: 27; standard deviation: 9.45), this 
difference being significant at the .05 level of 
confidence. The psycholeptic group shows a 
tendency to be more intropunitive than the 
norm (mean: 28; standard deviation: 8.25), 
though this difference does not reach a level of 
statistical significance. 

It was expected that the psycholeptic group 
would be more impunitive and less intro- or 
extrapunitive, but this was not the case. In- 
stead, there is a tendency for the psycholeptics 
to be more intropunitive and impunitive and 
less extrapunitive than the epileptics. The di- 
vision seems to be one of extrapunitiveness 
versus nonextrapunitiveness, or restraint of 
outward aggression rather than restraint of all 
aggression. This may be a significant finding 
in view of the interpretation of the psycho- 
genic seizure as a turning inward of pent-up 
hostility. 

A factor to be considered here is the ambigu- 
ity of the intropunitive response itself, a 
point previously noted by Rosenzweig and Sar- 
ason [15]. In many instances, the intropunitive 
response on the P-F Study seems to be an exaggerated
impunitive type of reaction rather
than a reflection of real guilt feelings. Such
answers as “I’m very sorry” have become so
conventional and stereotyped that with many
subjects they appear to mean only an easy way
of getting out of the frustrating situation or
glossing over the problem, in which actual self-aggression
is probably absent. Thus, the intropunitive
and impunitive responses may represent
the same nonaggressive trait. In this sense,
the essence of the original postulation, restraint 
of hostility, appears to have been substantiated 
to some degree. 


A comparison of the groups with respect to 
psychic tension, as indicated by the Rorschach 
and TAT anxiety-hostility scoring, is presented 
in Table 1. The comparisons are made in terms 
of the Total A-H score, which, as previously 
mentioned, is considered a more stable measure 
of the degree to which psychological tension 
is expressed in the content and the degree to 
which neutral or pleasant concepts are aban- 
doned. 


There was no difference in mean number of 
Rorschach responses given by each group, but 
there was a definite tendency for the psycho- 
leptic group to give more responses with anx- 
ious-hostile content. Similarly, the psycholeptics 
told more anxious-hostile TAT stories and 
fewer neutral or pleasant stories. In general, 
then, “psychic tension” appears to be greater 
in the psycholeptic group. 





Table 2

Correlation between Psycholepsy and
Experimental Variables

Experimental variables            Biserial r    Significance level

Repression S-F score                  .58           .01
Hypnosis No. criteria                -.03           N.S.
Extrapunitiveness                    -.25           .05
Intropunitiveness
  plus impunitiveness                 .22           .10
Rorschach Total A-H score             .25           .05
TAT Total A-H score                   .35           .01





In Table 2 the results of the study are pre- 
sented as biserial correlations between a diag- 
nosis of psycholepsy and increasing scores on 
the different variables, with psycholepsy-epilepsy
as the forced dichotomy. The single most 
discriminating variable is repression, a finding 
which suggests that this may be a more basic 
personality factor than the others in the dif- 
ferentiation of epileptic from psychogenic. The 
repressive aspect of the seizure is certainly the 
most immediate and outstanding feature, i.e., 
the “black-out,” amnesia, etc., whereas the
factor of repressed hostility is a derived or inferred
construct which becomes obvious only
after considerable study and interpretation of 
various behavior patterns. 

In spite of the lack of difference between 
the epileptics and psycholeptics on the hypno- 
sis variable, the latter group presents a pattern 
of response consistent with what might be ex- 
pected of a hysterical group. The most definite 
difference is on repression, which has from the 
first been linked with hysterical disorders. The 
evidence for repressing or suppressing aggression,
as shown on the Picture-Frustration
Study, is consistent with the interpretation of
the psychogenic attack as a release or expression
of pent-up affects which the subject does
not consciously recognize. The finding of
greater “psychological tension” in the psycho- 
leptics, as shown by the Rorschach and TAT 
responses, is consistent with the Picture-Frus- 
tration results, if one assumes that the responses 
on the Rorschach and TAT reflect more un- 
conscious attitudes, whereas the P-F responses 
are closer to conscious realization of their
meaning. The making of stories or the interpretation
of inkblots is apt to be farther from the
subject’s conscious control or censorship of his
responses than his answers to the P-F situations,
which reflect to some degree a knowledge
of, and willingness to abide by, social conventions.


Although there were some possibly invalid- 
ating factors with respect to the hypnosis, 
there remains the possibility that hysterics are 
actually no better hypnosis subjects than other 
individuals, except perhaps under optimal con- 
ditions. The increased suggestibility of the 
hysteric, the importance of which is agreed 
upon by many workers, is only one of the fac- 
tors contributing to the hypnosis; other fac- 
tors are the motivation of the subject, his past 
and present attitudes toward hypnosis, and his 
relationship to the hypnotist. With the psycholeptic
group, the factors of secondary gain
which the seizures furnish, repressed aggression
toward the examiner, etc., appear to overshadow
any intrinsically greater degree of suggestibility.
This finding is not unique, for
various studies have failed to link hypnosis ex- 
clusively with hysteria, probably because of the 
complexity of the hypnotic phenomena as well 
as the dubiously hysterical populations upon 
which the hypnosis was attempted. 

The biserial correlations of Table 2, with 
psycholepsy-epilepsy as the forced dichotomy, 
are referable to the assumption that so-called 
psychogenic seizures and epileptic seizures are 
symptoms of a combination of factors which 
operate on a continuum from “almost pure” organic
“causation” to “almost pure” psychological
“causation.” The two groups do differ
in terms of group averages, but the overlap 
in the psychological factors is considerable. In 
addition to the obvious imperfections of the 
measuring instruments, it is also clear that 
there is no true dichotomy of psychological in- 
volvement, but that the groups probably dif- 
fer because the subjects are drawn from dif- 
ferent points on the continuum. Gastaut’s 
findings, previously mentioned, clearly bring 
out the increasing importance of the concept 
of continuous degrees of convulsive suscepti- 
bility. The present study, in a gross fashion, 
approaches the problem from the opposite as- 
pect in its suggestion of psychopathology par- 
alleling the degree of convulsive suscepti- 
bility; ordinary or normal psychological ad- 
justment goes with a low convulsive threshold 
(shown by the abnormal EEG record), and 
increased psychopathology goes with a higher 
convulsive threshold (normal EEG record). 
This interpretation could be put to more pre- 
cise test by determination of the convulsive 
threshold in the manner described by Gastaut, 
and by correlation of the degrees of convulsive 
susceptibility with degree of psychopathology. 
The extent to which each factor, organic or 
psychological, is involved might furnish the 
basis for determination of treatment. It seems 
most difficult to attempt a differentiation of 


personality factors as “cause” of the seizures or
as “results,” since any personality maladjustment
may serve as a source of stress for the organism,
and continued stress may even contribute
to increased susceptibility, if only
through learning and not organic change.


Summary and Conclusions 


Two groups of seizure patients were com- 
pared for evidence of psychological differences. 
In one group, termed epileptic, the seizures 
were diagnosed as symptoms of disturbed brain 
functioning. In the other group, termed psy- 
choleptic, the seizures were diagnosed as symp- 
toms of psychological maladjustment. 

It was hypothesized that the psycholeptic 
group would (a) repress, (b) be hypnotizable, 
(c) react to frustration in an impunitive man- 
ner, and (d) show evidence of psychological 
tension, to a greater extent than the epileptic 
group. The hypothesis was sustained for (a) 
and (d). The psycholeptic group did not dif- 
fer from the epileptic group on impunitiveness 
as hypothesized, but was significantly less ex- 
trapunitive. This last finding was interpreted 
as indicating that instead of suppression of all 
aggression, only suppression of outwardly di- 
rected aggression is involved. No differences 
between the groups on degree of hypnotizabili- 
ty were demonstrated. 

In general, the results indicate that certain 
patterns of psychological reaction are more prev- 
alent in seizure patients in whom no evidence 
of disturbed brain function can be found. The 
personality pattern of the psycholeptic patients 
is consistent with the theoretical pattern for 
hysterics. There is also tentative evidence that 
the psycholeptic group deviates from the nor- 
mal with respect to the variables of repression 
and reaction to frustration, whereas the epilep- 
tic group does not appear to differ from the 
normal in these respects. 

There is thus some support for the differ- 
entiation of seizure patients as organogenic or 
psychogenic, but the considerable overlap be- 
tween the groups on the variables studied sug- 
gests that the differentiation is not a clear-cut 
one, and that it applies mainly to the extremes 
of a psychogenic-organogenic continuum. It is 
suggested that different convulsive thresholds 
may be positively related to different degrees 
of the psychological maladjustment which 
serves as the stress adequate to precipitate the 
seizure.


Research Origin and Construction of the I. P. A. T. 
Junior Personality Quiz 


Raymond B. Cattell and Halla Beloff


Laboratory of Personality Assessment and Group Behavior 
University of Illinois 


Alike in predicting school progress and in 
child guidance work a need has long been felt 
for an adequate and comprehensive personality 
measuring instrument with which to supple- 
ment ability tests. Within the adult range, 
researches on the nature and measurement of 
whatever stable, independent dimensions of 
personality can be shown to exist [3] have re- 
sulted in practical instruments for the meas- 
urement of personality source traits, such as 
the 16 Personality Factor Questionnaire (the 
16 P. F. test [11]). But it is only recently 
that the personality factor structure of 10- to 
14-year-old children has become sufficiently 
clarified, in terms of behavior ratings [9], 
questionnaire responses [10], and objective 
tests [8], to permit any truly informed test 
construction. 

The present article deals with the develop- 
ment of a children’s personality questionnaire, 
paralleling the adult 16 Personality Factor 
Questionnaire [4, 11] and based on the pub- 
lished factorization of children’s questionnaire 
responses. For although the final aim of the 
researchers in this laboratory has been to de- 
velop objective types of tests, it seems desirable 
to make available immediately to applied psy- 
chology a measure of these primary factors. In- 
deed, since the objective test battery for 12 
factors is likely to require about four hours of 
testing, it seems that there will in fact always 
be situations where practitioners will be com- 
pelled by practical restrictions to use the more 
convenient questionnaire form. 

In considering this last point—motivational 
questionnaire distortion—one should bear in 
mind, however, that in the present J.P.Q. as 
in the adult 16 P.F., the questions are, in gen- 
eral, indirect and are not accepted at their face 
value as self-assessments. Q data in our studies
are no longer on the footing of introspective
evidence but are treated as strictly behavioral 
responses, the meaning of which is to be found, 
not from the overt verbal meaning, but from 
correlations with observed behavior in real life 
situations. 

No description will be given here of the 
data which are being related to the question- 
naire responses, for they are set out elsewhere 
[8, 9], but a brief recapitulation will now be 
made of the research origin of the test, namely, 
of the factor analysis of the questionnaire re- 
sponses upon which the present test construc- 
tion rests. 


Synopsis of the Factor Analytic Foundation 


Since our intention was to arrive at a questionnaire
of nearer 100 than the usual 200 items (on account
of children’s limited powers of attention), we
planned to begin with about 300 items, which could
then be subjected to about two-thirds attrition by
the selection process. These were chosen (a) to
cover the whole field of behavior, i.e., not merely
neurotic, etc., areas; (b) to be of interest to
children; (c) to correspond in many cases to items
in the adult 16 P.F., being thus already known to
measure particular factors. Hence, by apportioning
about ten such equivalent questions to each adult
factor, we hoped to succeed in cross-matching
childhood and adult factors by “marker variables”
alone. A 295-item questionnaire was ultimately
presented to 330 boys and girls between the ages
of 10 and 13 (averaging 11.2 years), using
dual-choice answers. This set of items was reduced
in number by the following selection processes.


1. All items were dropped which children found
difficult to read, or of uncertain meaning.
2. All items were dropped in which the yes/no
cut was more eccentric than 70%-30%.
3. All items were dropped which had no signifi- 
cant correlation with any other item. 
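
The second and third screening rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the item names and responses are invented, answers are coded 1 for "yes" and 0 for "no", and the threshold `min_abs_r` is a hypothetical stand-in for the significance test, whose exact form the text does not give. The sketch also applies rule 3 only among items that survive rule 2, which is one possible reading.

```python
from math import sqrt


def endorsement_rate(responses):
    """Proportion of 'yes' (coded 1) answers given to one item."""
    return sum(responses) / len(responses)


def pearson(x, y):
    """Product-moment correlation; returns 0.0 if either item has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen_items(items, min_rate=0.30, max_rate=0.70, min_abs_r=0.15):
    """Keep items whose yes/no cut is no more eccentric than 70%-30%
    and which correlate at least min_abs_r with some other kept item.
    min_abs_r is a hypothetical cutoff, not a value from the article."""
    balanced = {name: resp for name, resp in items.items()
                if min_rate <= endorsement_rate(resp) <= max_rate}
    kept = []
    for name, resp in balanced.items():
        others = (pearson(resp, other_resp)
                  for other, other_resp in balanced.items() if other != name)
        if any(abs(r) >= min_abs_r for r in others):
            kept.append(name)
    return kept


# Invented responses from eight children to four items.
items = {
    "q1": [1, 1, 1, 1, 0, 0, 0, 0],
    "q2": [1, 1, 1, 0, 1, 0, 0, 0],  # correlates .50 with q1
    "q3": [1, 1, 1, 1, 1, 1, 1, 0],  # 87.5% "yes": too eccentric a cut
    "q4": [1, 0, 0, 1, 1, 1, 0, 0],  # uncorrelated with q1 and q2
}
kept_items = screen_items(items)
```

Here `q3` fails the cut criterion and `q4` fails the correlation criterion, so only `q1` and `q2` remain.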


These eliminations left 103 questions, which 
were factorized in a single matrix and rotated 
successfully to simple structure with respect to 
14 factors. At this point, with the factors de- 
fined, 115 newly constructed items, which also 
passed the first two of the above criteria, were 
correlated on the same population in an
“extension of the matrix” by Dwyer’s method
[14], to provide more items for some factors 
which seemed likely to have less than 12 signif- 
icantly loaded items each. In this way about 
4 or 5 new items were picked up for each fac- 
tor, giving a total of 218 factored items. 

From this matrix of 218 items factored into 
15 factors we aimed to build up the required 
junior personality questionnaire instrument. 


Selection of Factors for Inclusion 


The decision as to which factors should be 
set up in the standardized test was made on 
the following considerations: 


1. Does the factor have large variance, i.e., is 
there a sufficiency of items having significant load- 
ing to make a minimum of a dozen good items for 
the factor measurement? In the end, three of the 
last five factors in order of variance were dropped 
for failure in this respect. 

2. Does the factor have adequate split-half relia- 
bility? To assess the internal consistency of the fac- 
tors chosen by the simple structure the 218 items 
were readministered to a fresh population of 76 
children. In this calculation of the split-half co- 
efficient for each factor the scoring was made only 
on items above a loading of 0.20 on the factor con- 
cerned—averaging about 14 items to the factor. 
These values ranged from practically zero to 0.83 
and the lowness of some suggested that further work 
is required to improve the simple structure of the 
chosen factorization—a task best performed by in- 
dependent rotation of a completely new experimen- 
tal study. In the end the following factors were 
dropped (from the present questionnaire construction)
on this second basis (i.e., falling below a
consistency r = 0.41): 10, 13, and 14. Factors 4, 6,
and 7, which also fell below the 0.41 limit, were,
however, retained because they were highly satisfactory
in regard to other criteria, and we suspected that 
sampling error in this relatively small group could 
have underestimated their true consistency. 

3. Could four psychologists independently examining
the items collected for each factor arrive at common,
verbally consistent psychological descriptions (and
perhaps even interpretations) of the factors
discovered? This is not an attempt to put new wine
in old bottles. As far as interpretation is concerned,
if the factor unity does not fit pre-existing clinical
or literary psychological conceptions of functional
unities, such a finding is accepted principally as a
reflection on the validity of the latter. But, at the
descriptive level, on the other hand, one has increased
confidence in the factor if the same, consistent
central characteristic is assigned to it by different
perusers of the collected loaded items. Factors 12 and
14 were the only ones in which differences of
description and conception occurred, and this caused
these, when combined with other defects, to be rejected.

4. Is the simple structure convincing in terms of
clearness of hyperplane, reaching a level of 60%
or more items in the ±0.10 hyperplane?
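
The fourth consideration, the hyperplane count, lends itself to a short illustration. The sketch below assumes that "items in the ±0.10 hyperplane" means items whose loading on the factor is no larger than 0.10 in absolute value; the loadings are invented for the example and are not taken from the rotated matrix.

```python
def hyperplane_fraction(loadings, width=0.10):
    """Fraction of items whose loading on one factor lies within +/- width."""
    inside = sum(1 for value in loadings if abs(value) <= width)
    return inside / len(loadings)


def passes_hyperplane_criterion(loadings, width=0.10, minimum=0.60):
    """True when at least `minimum` of the items fall in the hyperplane."""
    return hyperplane_fraction(loadings, width) >= minimum


# Hypothetical loadings of ten items on a single factor.
example_loadings = [0.45, -0.38, 0.31, 0.05, -0.02,
                    0.08, 0.00, -0.09, 0.03, 0.07]
fraction = hyperplane_fraction(example_loadings)
acceptable = passes_hyperplane_criterion(example_loadings)
```

With seven of the ten invented loadings inside the band, the factor would meet the 60% level.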


The combined application of these criteria at
a severe level would cut the questionnaire
down to 5 or 6 factors. Among those lost
would be some clearly recognizable, known
factors in the adult realm, such as I (Sensitive
emotionality), Q3 (Will control), and H
(Cyclothymia), as well as some less confidently
labeled factors which are nevertheless of large
variance in terms of accounting for child be- 
havior. It would seem a mistake to eliminate 
from a routine test, simply because split-half 
reliability is not yet high, any factor which 
could contribute (with refinement of items) to 
increasing educational prediction or clinical 
prognosis. For if and when these predictive 
powers are confirmed through use and external 
validation of the test, the reliability coefficients 
can readily be raised by increasing the number 
of items and by conventional item analysis. 
And since expense does not permit the publica- 
tion of two tests—one, for routine practice, 
containing only the 6 factors of highest clarity 
and reliability, and another, for further re- 
search, of lesser internal consistency but involv- 
ing all 15 newly adumbrated factors—it 
seemed best to compromise, by drawing the line 
at those 11 factors in which all the above cri- 
teria are tolerably satisfied simultaneously. 

Thus we decided to set up, in the standard 
questionnaire, factors 1, 2, 3, 4, 5, 6, 7, 8, 9, 
11, and 15, leaving the remaining 3, adequately 
described elsewhere [10], to await further 
basic research. 


Questionnaire Design and the Consolidation 
of Factors 

With the aim of convenience in use the following
considerations of design directed the
choice and arrangement of items for the above 
eleven factors in the actual questionnaire. 

1. A uniform 12 items per factor was aimed 
at and proved to be feasible for all factors, 
since only on 1 or 2 items on two or three fac- 
tors did this force us slightly below the chosen 
lower limit of loading of 0.25. (The per item 
average is naturally much higher.) Twelve is 
not a very large number of items on which to 
measure a factor, but it is as much as is com- 
patible with having a questionnaire which can 
be completed in a class period and within the 
endurance of a ten-year-old child. Higher relia- 
bility is best achieved, as we are planning, by 
constructing a second equivalent scale also with 
12 items per factor. Our first step was there- 
fore to examine the rotated matrix [10] and 
make tests of about the 24 highest-loaded items 
for each factor—counting on a 50% rejection 
rate by further considerations. 


2. We desired to avoid spurious correlations
among the factor measurements and aimed to
achieve this (a) by never using one item in
more than one factor, (b) by balancing plus
and minus loadings occurring on an irrelevant
factor among items chosen for high loadings
on the factor concerned. This involves a good
deal of art in combination. For example, item
34 has its highest loading, -.39, on Factor 4,
but also loadings of -.21 on 15, .26 on 7, and
.26 on 9. These are substantially “cancelled” by
including in Factor 4, (a) Item 146 (.37 on
4) with a loading of .28 on 15; (b) Item 79
(-.38 on 4) with a loading of -.23 on 7; and
(c) Item 203 (.33 on 4) with a loading of
-.24 on 9.

This cancellation—such that no loading on 
an extraneous factor was repeated with the 
same sign, i.e., unsuppressed, in an item chosen 
for the given factor—was achieved on all but 
Factor 7 (slight excess of —8) and Factor 1 
(slight excess of +6, —7, and —8). In these 
last cases, where the nature of the intruding 
factor was well known, an attempt was made 
to eliminate them psychologically by molding 
the verbal expression of the item so as not to 
involve the contaminating tendency. 
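
The cancellation described for Factor 4 can be checked by simple addition. The sketch below sums the extraneous loadings quoted in the example for items 34, 146, 79, and 203, restoring decimal points to the printed values. It takes the signs as printed and ignores the direction in which each item is keyed for scoring, which the text does not spell out, so it should be read as an arithmetic illustration rather than as the authors' procedure.

```python
# Loadings on irrelevant factors (15, 7, 9) for four items chosen for Factor 4,
# as quoted in the text with decimal points restored.
extraneous = {
    34: {15: -0.21, 7: 0.26, 9: 0.26},
    146: {15: 0.28},
    79: {7: -0.23},
    203: {9: -0.24},
}


def net_extraneous(item_loadings):
    """Sum, factor by factor, the irrelevant loadings carried by the items."""
    totals = {}
    for loadings in item_loadings.values():
        for factor, value in loadings.items():
            totals[factor] = totals.get(factor, 0.0) + value
    return {factor: round(total, 2) for factor, total in totals.items()}


net = net_extraneous(extraneous)
```

The residual loadings come to .07, .03, and .02 on Factors 15, 7, and 9, against .21 to .26 for item 34 alone, which is the sense in which the intrusions are "substantially cancelled."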

3. Lastly, response position and affirmative- 
negative tendencies were balanced within each 
factor. There was initially some predominance 
of “Yes” answers operating in the direction of 
the high factor scores. Where the item could
not be inverted without damage to its meaning,
it was left untouched and an “a” or “b” response
in the same factor was inverted instead,
thus balancing on position of responses. 

Except for this latter concession, made in 
three factors, both position (first and second) 
and affirmative-negative responses were simul- 
taneously balanced in contributing to the factor 
score. Whenever inversion of the sentence or 
meaning had to take place, the opportunity was 
taken to employ the modification to make the 
item agree better with the meaning of the fac- 
tor as it had finally come to be understood. 
Such changes were kept slight, and were car- 
ried out only by a person with long experience 
in this field. 

4. Inspection of the 132 surviving items sug- 
gested that in a few cases the secondary pro- 
cesses of (2) and (3) above might have re- 
sulted in taking items not altogether adequate 
in primary loadings. Moreover each factor still 
contained, on an average, two items of its 
twelve with loadings initially between 0.25 and 
0.30, contrary to our hope of setting the lower 
limit of 0.30 in the final questionnaire. Accord- 
ingly it was decided to set up a new experiment 
to recruit fresh “replacement” items for those 
inadequate and below 0.30. For this purpose a 
further experimental “extension” of 128 new 
questionnaire items was set up, the items being 
specially created, on the basis of newly gained 
psychological insights through the factor analy- 
sis. 

These were administered to 120 eleven- and 
twelve-year-old boys and girls, along with the 
questionnaire as it then stood at 132 items, 
and a second “extension matrix” was calcu- 
lated showing the correlations of the 11 factors 
with each and all of these new items. When- 
ever a new item correlated 0.30 or more with a 
factor, without correlating significantly with 
any other factor, it was used to replace one of 
the lowest 2 or 3 items in the existing factors. 
In this way 28 improved items were introduced 
replacing those deemed inadequate. 

This final administration gave another op- 
portunity to ensure that no modified or 
overlooked item could remain that would be 
difficult of comprehension by ten-year-old chil- 
dren, and in two cases slight simplifications 
were found possible. 
Table 1
J. P. Q. and 16 P. F. Correlations

16 P. F. Test Factors

J.P.Q. Factors A B C E F G H I L M N O Q1 Q2 Q3 Q4
1 51* -38 -30
2 -48 ty 2 68* -39 59
3 -60 -51 38 38 62 -38 67
4 -30 48* 44
5 -23 43 -53 49
6 32 36
7 32 63*
8 31 39 36 33 -39 38 -47
9 34
10 36*
11 35* -33
12 35*

* Highest r’s in row and column.


5. The factors were finally arranged with 
these items in cyclical order in the question- 
naire, for ease of scoring on the answer sheet, 
with a stencil. The items of easiest decision 
and reading were arranged at the beginning for 
“warming up.” 

6. For uniformity with the adult 16 P.F., 
and because the inclusion of a brief measure of 
the intelligence factor along with other factors 
in the latter has already proved highly con- 
venient in many practical situations, it was 
decided to include a brief intelligence test here 
also, as a twelfth factor. For this purpose 12 
items were taken from Cattell’s earlier valida- 
tion and standardization of verbal intelligence 
tests [1, 2], being chosen to subtend the range 
from 10 to 15 years with maximum discrimina- 
tion. The total number of items in the test is 
thus raised to 144, which, at a maximum allowance
of 1/4 minute per item, should still be
within the usual length of a class period. A 
time-of-working distribution for various ages 
will eventually be worked out. 

7. Although the ultimate interpretations and 
labeling of these factors must await their cor- 
relations with all kinds of child behavior and 
performance, some preliminary indication of 
their nature could be gained by correlation 
with the adult 16 P.F. test factors, which al- 
ready have many associations. Accordingly the 
J.P.Q. was administered along with the 16 
P.F. to 100 16-year-old high school boys and
girls in the U.S.A. and 56 15-year-old second- 
ary school children in Britain. Only the r’s 
above 0.30 are recorded in Table 1. 

To approach a cross-identification of factors
from the data yet available in Table 1 is not
a simple step. For we have to take account first
of the corrections for attenuation justified by
the marked brevity of the factor measures;
second of the fact that any factor measure is
at present still contaminated with others; and
thirdly that even the pure factor measures
themselves are slightly correlated. The combined
effect of the second and third of these
can be appraised from Table 2, representing
the correlations among the 12 factors of the
J.P.Q., as measured by the final test form
upon a new population of 100 twelve-year-olds.


Table 2
J. P. Q. Factor Intercorrelations

       1     2     3     4     5     6     7     8     9    10    11    12
 1
 2   .35
 3   .18   .44
 4   .12   .10  -.31
 5  -.18   .18   .39  -.37
 6  -.02  -.05  -.22   .24  -.10
 7  -.13  -.29  -.18  -.03  -.19   .25
 8   .17   .01  -.27   .29  -.38   .17  -.04
 9  -.25   .18   .26  -.21   .31  -.09  -.05  -.42
10  -.30  -.17  -.12  -.05   .12   .09   .09  -.09   .07
11   .17   .06   .15  -.14   .09  -.04   .04  -.23  -.09   .26
12   .02   .11  -.09   .13  -.06   .01  -.12   .08  -.10  -.08  -.04


The pattern of intercorrelations has a slight 
resemblance to the pure interfactor correlations 
obtainable from the matrix [10] but in the 
main the intercorrelations are larger and evi- 
dently due to some items in each factor still 
being appreciably loaded in other factors too. 
Further research will be directed to purifying 
factors by replacing these items. Meanwhile 
we know that the highest intercorrelation is 
0.44 and some 46 of the 66 interrelations are 
below 0.20, which is below significance (1%) 
for this size of sample. Parenthetically it is
noticeable that the significant correlations
(0.44 between Factors 3 and 2, and 0.39 between
3 and 5) are confirmed by these factors
having similar profiles in their correlations
with 16 P.F. factors in Table 1.
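
The two summary figures quoted for Table 2 (a highest intercorrelation of 0.44, and 46 of the 66 coefficients below 0.20) can be recounted from the table itself. The sketch below enters the lower triangle as read from the printed table, with decimal points restored; a few signs are hard to read in the source and should be treated as uncertain, but the counts depend only on magnitudes.

```python
# Lower triangle of Table 2, rows for Factors 2 through 12.
TABLE_2 = [
    [.35],
    [.18, .44],
    [.12, .10, -.31],
    [-.18, .18, .39, -.37],
    [-.02, -.05, -.22, .24, -.10],
    [-.13, -.29, -.18, -.03, -.19, .25],
    [.17, .01, -.27, .29, -.38, .17, -.04],
    [-.25, .18, .26, -.21, .31, -.09, -.05, -.42],
    [-.30, -.17, -.12, -.05, .12, .09, .09, -.09, .07],
    [.17, .06, .15, -.14, .09, -.04, .04, -.23, -.09, .26],
    [.02, .11, -.09, .13, -.06, .01, -.12, .08, -.10, -.08, -.04],
]

values = [r for row in TABLE_2 for r in row]
n_pairs = len(values)                                 # 12 * 11 / 2 pairs
n_small = sum(1 for r in values if abs(r) < 0.20)     # coefficients below .20
largest = max(abs(r) for r in values)                 # highest intercorrelation
```

Read this way, the table reproduces both figures given in the text.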


To seek cross-identification we may first
make a rough correction for attenuation
effects¹ by taking the r’s in Table 1 that are
simultaneously highest in their row and column,
and marking these with an asterisk. This gives
a match of J.P.Q. Factor 1 with 16 P.F.
Factor I and, similarly, 2 with O, 3 with C
and Q4 combined, 4 with G, 5 with Q3, 7
with H, 10 with F, 11 with A, and 12 with
B. Thus in 9 of the 12 new factors there is no
alternate 16 P.F. factor which would compete
with these interpretations, while in the
remaining 3 the identification must be considered
indeterminate, 6 possibly matching N,
8 possibly L, and 9 possibly M. The one
unsatisfactory conclusion above is in having to
match 3 with a combination of C and Q4
(though its resemblance in meaning to both
was actually noted, before correlation, in the
former article [10]). C and Q4 are alike in
being labeled the neuroticism factors in the
adult-population studies and in having somewhat
similar effects on social situations, e.g.,
in most potently reducing an individual’s
chances of being elected a leader [12].


¹ A more precise correction for attenuation can be
obtained from the reliability coefficients reported
earlier [10], which (Spearman-Brown corrected)
are, for the factors in order: (I) .71, (II) .83,
(III) .64, (IV) .36, (V) .63, (VI) .26, (VII) .38,
(VIII) .39, (IX) .54, (X) .44, (XI) .56, (XII)
.54 on an eleven-year-old population.
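
The matching rule used here, taking the coefficients that are at once the highest in their row and in their column, can be stated as a small routine. The sketch below is an illustration only: the table is a made-up three-by-three fragment, not Table 1, and it assumes that "highest" is judged on absolute value, which the text does not say outright.

```python
def mutual_maxima(table):
    """Return (row, column, r) for every cell whose |r| is the largest
    in both its row and its column. `table` maps row -> {column: r}."""
    matches = []
    for row, cells in table.items():
        row_max = max(abs(v) for v in cells.values())
        for col, r in cells.items():
            col_max = max(abs(other[col])
                          for other in table.values() if col in other)
            if abs(r) == row_max and abs(r) == col_max:
                matches.append((row, col, r))
    return matches


# Hypothetical correlations of three junior factors with three adult factors.
toy_table = {
    "J1": {"A": 0.10, "B": 0.05, "I": 0.51},
    "J2": {"A": 0.20, "B": 0.12, "I": 0.30},
    "J3": {"A": 0.35, "B": 0.33, "I": -0.38},
}
matched = mutual_maxima(toy_table)
```

Only J1 receives a match (with I). J2 and J3 have their largest coefficients in a column already dominated by J1, so under this rule they would be left indeterminate, much as Factors 6, 8, and 9 are in the text.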

The complexity of the manifest correlations
in Table 1, beyond what would be expected
from a simple one-to-one match, could arise
from (a) the above brevity-attenuation effects;
(b) the factor structure in preadolescent children
being actually different from that of
adults; (c) the intercorrelations of the equivalent
child and adult factors being different;
(d) the measures being contaminated in one or
both scales; and (e) the two scales, adequate
in their proper age ranges, being distorted in
the intermediate age where we try to make
them meet.

Only further experience and research can 
clarify these issues. As indicated, a study is in 
progress to give an independent contribution to 
the J.P.Q. rotation, admittedly not perfect, 
and to replace contaminating items, which 
should truly align the J.P.Q. and 16 P.F. 
factors now matched. On the other hand, with 
respect to the difficulty in J.P.Q. Factor 3 
(with two marked r’s in its row in Table 1) 
it is psychologically conceivable that two pat- 
terns of adult neuroticism develop out of a 
single “protoneuroticism” factor in children in
the latent period, which factor is perhaps de- 
rivable from the adult data as a second-order 
factor. Parenthetically, the fact that in two 
cases the r’s accepted for matching fall as low 
as 0.35 should not worry us when the whole 
configuration of correlations is sound. Actually, 
one of these two lowest is between intelligence 
in the J.P.Q. (Factor 12) and intelligence in 
the 16 P.F. (Factor B), each constructed of 
a dozen standard test items! If test brevity, 
and application of both tests to the limits of 
their ranges, can do this to such a well-known, 
tried test factor as general intelligence, then 
the r’s averaging 0.6 among the newer person- 
ality factors are quite compatible with a match- 
ing r of unity between the two factor measures 
of these personality dimensions, with length- 
ened scales from the present tests. 
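
The argument of this paragraph rests on two standard formulas: the correction for attenuation, r / sqrt(r_xx * r_yy), and the Spearman-Brown formula for a lengthened scale. The sketch below applies them to the observed 0.35 between the two intelligence measures. It takes the .54 listed as (XII) in the footnote on reliabilities as the value for J.P.Q. Factor 12, which is an assumption about the numbering, and it uses an invented reliability of .50 for 16 P.F. Factor B, since the article reports none.

```python
from math import sqrt


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Estimated correlation between true scores, given the observed
    correlation and the reliabilities of the two measures."""
    return r_xy / sqrt(rel_x * rel_y)


def spearman_brown(reliability, length_factor=2.0):
    """Predicted reliability when a scale is lengthened by length_factor."""
    return (length_factor * reliability
            / (1 + (length_factor - 1) * reliability))


R_OBSERVED = 0.35   # J.P.Q. Factor 12 with 16 P.F. Factor B (from the text)
REL_JPQ_12 = 0.54   # footnote value (XII); numbering assumed to match
REL_16PF_B = 0.50   # hypothetical, for illustration only

corrected = correct_for_attenuation(R_OBSERVED, REL_JPQ_12, REL_16PF_B)
doubled = spearman_brown(REL_JPQ_12)   # a second 12-item scale added
```

Under these assumptions the corrected coefficient is about .67, and doubling the 12-item scale would raise its reliability from .54 to about .70. The figures are illustrative and say nothing about the actual reliability of the adult scale.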
Summary


1. A multiple-factor, simple-structure factor 
analysis has been carried out (with extension 
devices) on 330 ten- to fourteen-year-old boys 
and girls on 423 groomed questionnaire items 
carefully chosen to represent the important di- 
mensions of behavior in the total personality 
sphere. 

2. By dropping 291 items unsatisfactory for 
eccentricity of cut, ease of comprehension, or 
goodness of loading, as well as items for 4 
factors of defective variance or stability, a 
questionnaire of 144 items measuring 12 fac- 
tors by 12 items each has been constructed 
(adding 12 items for intelligence). By correlation
with the adult 16 P.F. these factors are
revealed to correspond with such well-known
factors as general intelligence (B), general
neuroticism (C and Q4), emotional sensitivity
(I), adventurous cyclothymia (H), anxiety
level (O), will control (Q3), super-ego
strength (G), etc., though slight adjustments
are still required for proof of full alignment.

3. The usual test-presentation requirements 
of balancing “yes” and “no” answers within
each factor, balancing suppressor loadings for 
irrelevant factors, eliminating undue vocabu- 
lary and reading-time difficulties, arranging 
items in simple cyclical patterns for scoring 
ease, and presenting simple instructions and ex- 
amples for test and machine score-form use, 
have been worked out. The final J.P.Q. test 
(Junior Personality Quiz [12]) has 144 
items; is suitable for children of 10-16 years;
and permits completion in 30 minutes by 90% 
of children (but may require up to an hour for 
a small minority). 

The external or social validity of this inter- 
nally validated test remains to be investigated 
by use of the test in schools and clinics. How- 
ever, as Cronbach [13] and others have point- 
ed out in terms of information theory, the fact 
that the test deals with so many demonstrated- 
ly independent dimensions of behavior argues 
for the whole giving better predictions than 
any single, long, and reliable scale, because 
each factor brings new information. Thus the 
use of only 12 items for the intelligence test 
and of the remaining 132 for eleven further 
dimensions should increase clinical and educational
predictions decidedly above that obtainable
from using this number of items (and the
rest of the available testing time) in increas- 
ingly meticulous measurement of, say, intelli- 
gence alone, or in some personality test not 
demonstrated to deal with more than one or 
two factors. 

A temporary standardization has been issued 
with the test and a further age and sex stand- 
ardization on 2000 cases is in progress. Social 
extension of the meaning of the factors is also 
at hand in studies in press [6, 8, 11] relating 
the questionnaire factors to behavior rating and 
objective test factors obtained for the same 
population.²


² The writers wish to express their thanks to the
teachers and psychologists of Mooseheart and of 
the Decatur School System, for providing popula- 
tions for experimentation. 


Received February 16, 1953. 
Any study dealing with anxiety must show
its recognition of the two-headed aspects of the
concept and make clear which of the many possible
definitions is being considered in the
study. May [7] has pointed out that anxiety
can be considered either as a theoretical motivating
factor, present by inference rather than
by direct observation, or as a descriptive term
for a syndrome of physiological and psychological
tensions. It is primarily the latter that
will be considered in the present study, and
the term anxiety stress will be used to differentiate
it from the term anxiety used as a construct
in motivation theory.


Most of the newer texts, together with the
majority of recent literature in clinical psychology,
have tended to emphasize the importance
of dealing with dynamic personality
qualities such as anxiety, guilt, ego-strength,
etc., for personality predictive purposes.
Correspondingly, less and less interest is being placed
on standard diagnostic or nosological terms
such as psychopath, manic, anxiety neurotic,
etc. Projective techniques have been especially
valuable in focusing attention on these dynamic
qualities, yet few of these techniques have been
developed to measure a quality such as anxiety
stress directly. Most of the efforts of the
projectivists have been bent toward discovering
the underlying qualities of the personality, and
the anxiety discussed is almost always anxiety
by inference.


Wechsler and Hartogs [11] have developed and
used short tests of motor skills, level-of-aspiration
tasks, Draw-a-Person tests, etc. to differentiate
between anxiety neurotics, conversion hysterics,
obsessive-compulsives, involutional depressives, and
incipient schizophrenics. Although many of these
measures show some real promise once the
quantification problem is solved, they are still in the
experimental stage.

Much more popular clinically is the use of
techniques already designed for other purposes to
measure anxiety stress. Rashkis and Welsh [8]
developed a means of detecting anxiety by use of the
Wechsler-Bellevue scale; Elizur [1] has produced
a content-scoring technique for scoring the
Rorschach for anxiety and hostility; and a recent study
[3] seems to confirm the promise of such an
adaptation.

Probably the most popular technique used to
obtain a measure of anxiety stress is the
MMPI. Taylor [9] constructed an anxiety scale by
having judges choose from the MMPI those items they
considered as indicative of manifest anxiety. An item
analysis of these items pared the number of items
on the final scale to 50. Winne [13] constructed a
neuroticism scale with items derived from the
so-called neurotic triad which differentiated at the 1%
level of confidence between 140 neurotics in a VA
mental hygiene clinic and a matched group of
1[illegible] normals. A recent study [5] measuring the
relationship between the Winne and Taylor scales
found them to have a correlation approaching unity.

Other attempts [4, 6] have been made to distinguish
the amount of anxiety in a patient by using
combinations of the traditional MMPI scales
instead of attempting to develop a new anxiety
scale. Welsh [12] has approached the problem of
profile analysis of anxiety with the most
sophistication and has developed two measures, an Anxiety
Index and an Internalization Ratio, which seem
to differentiate effectively between a wide variety
of clinical and normal populations.


¹ Data for this study were obtained by the
Pennsylvania State College research group, whose original
studies were published in Group Report of a Program
of Research in Psychotherapy, Pennsylvania
State Coll., Psychotherapy Research Group, 1953.

Problem. The purpose of the present study 
is to determine if there have been changes in 
the anxiety-stress level, as measured by MM- 
PI scales, of a group of college students, con- 
comitant with the application of client-cen- 
tered therapy. 

It is rather difficult to get even tentative 
agreement among various schools of thought
on the goals of therapy; however, practically 
all disciplines mention the diminution of anxi- 
ety stress as an important subgoal. Certainly 
the removal of anxiety without a subsequent 
change in the ability of the individual to adapt 
is of limited value. It is equally true that with- 
out anxiety-stress reduction there is little real 
chance that a patient will be able to adapt 
himself more efficiently. It was felt by the 
writer that a constructive approach toward the 
merging of objectives of various schools of 
therapy would be obtained by measuring the 
effectiveness of each school in attaining the 
measurable subgoals that they claim to achieve 
by their methods. 


Hypotheses. 


1. College students undergoing client-cen- 
tered therapy will show significant decrease 
in anxiety stress following such therapy. 


2. There should be a relationship between 
reduction in anxiety and therapy success. 
Therefore, a relatively high correlation would 
be expected between reduction in anxiety- 
stress scores and success in therapy. 


Method 


Subjects. The 42 subjects involved in this 
study were all students of the Pennsylvania 
State College who came to the Psychological 
Clinic between September of 1949 and August 
of 1950, either by request of various agencies 
or of their own volition, to obtain aid in their 
personal adjustment. All these subjects were 
counseled by advanced graduate students in 
clinical psychology who had been trained in 
client-centered therapy. No attempt was made 
to diagnose these cases prior to therapy ex- 
cept that persons judged as prepsychotic or 
psychotic on the basis of an intake interview 
were transferred to the staff psychiatrist. The 
median number of therapy interviews was be- 
tween five and six, although some cases were 
seen for a much longer period of time. 

Procedure. A battery of tests consisting of 
the Rorschach, MMPI, and the Mooney 
Problem Check List was given to each of the 
subjects before the beginning of the therapy 
interviews or, at the latest, before the second 
therapy interview. The posttherapy tests were 
given after an agreement was reached by the 
therapist and client that therapy contacts
should end.


The measures used in the present study 
were: 


1. The Taylor Anxiety Scale. This scale was not 
used in its entirety since many of the present sub- 
jects finished only 366 of the total MMPI items. 
This necessitated using only 34 out of the total of 
50 items. 


2. The Winne Neuroticism Scale. This scale was 
included because of its high imputed relationship 
with the Taylor scale, and to find out if “neuroti- 
cism” in this case was synonymous with anxiety 
stress. 

3. The Welsh Anxiety Index. This scale was in 
reality a combination of the standard scales already 
in use on the MMPI. This score was obtained by 
the formula: 

$$AI = \frac{Hs + D + Hy}{3} + \left[(D + Pt) - (Hs + Hy)\right].$$


Since this scale requires no scoring other than 
the original scales of the MMPI, it could obviate 
the use of the Taylor and Winne scales if its pre- 
dictive ability were the same or better than those 
scales. 

4. The Welsh Internalization Ratio. This was
obtained by the following formula:

$$IR = \frac{Hs + D + Pt}{Hy + Pd + Ma}.$$

This measure was included to see whether the
Internalization Ratio would change in the expected
direction, toward externalization. As can be seen by
the formula, the higher the score the greater the
internalization.²
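
The two Welsh measures are simple functions of the standard MMPI scale scores, and a short sketch may make the scoring concrete. It assumes the usual published definitions, AI = (Hs + D + Hy)/3 + [(D + Pt) - (Hs + Hy)] and IR = (Hs + D + Pt)/(Hy + Pd + Ma), and uses invented T scores for a single hypothetical client. The group means near 100 reported for the Internalization Ratio suggest the ratio was multiplied by 100 in this study; that scaling is an inference and is left as an option here.

```python
def welsh_anxiety_index(hs, d, hy, pt):
    """Anxiety Index: mean of the neurotic triad plus the excess of
    (D + Pt) over (Hs + Hy)."""
    return (hs + d + hy) / 3 + ((d + pt) - (hs + hy))


def welsh_internalization_ratio(hs, d, pt, hy, pd, ma, scale=1.0):
    """Internalization Ratio; higher values mean greater internalization.
    `scale` is a hypothetical convenience (e.g. 100) and not from the text."""
    return scale * (hs + d + pt) / (hy + pd + ma)


# Invented T scores for one client, for illustration only.
scores = {"hs": 60, "d": 70, "hy": 62, "pd": 58, "pt": 72, "ma": 50}

ai = welsh_anxiety_index(scores["hs"], scores["d"], scores["hy"], scores["pt"])
ir = welsh_internalization_ratio(scores["hs"], scores["d"], scores["pt"],
                                 scores["hy"], scores["pd"], scores["ma"])
```

For these scores the Anxiety Index is 84 and the Internalization Ratio is a little under 1.19 (about 119 on a times-100 scale).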


The standard t test of significance was used
to determine the significance of the differences 
between the pretherapy and posttherapy meas- 
ures. The differences on each of the preceding 
scales were correlated by the Pearson product- 
moment r method with a multiple criterion for 
success in psychotherapy to see which of the 
measures predicted best to that criterion. Cor- 
relations were also calculated between the in- 
dividual measures which went to make up the 
multiple criterion and pretherapy to postthera- 
py differences on the anxiety-stress scales. 
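
The analysis described above has two parts: a t test on pretherapy and posttherapy scores, and product-moment correlations between change scores and a criterion. The sketch below illustrates both on invented data for five clients. It assumes a paired (correlated-means) t, since the same subjects were tested twice; the article says only "the standard t test," so this is one reading and not a statement of what was actually computed.

```python
from math import sqrt


def paired_t(pre, post):
    """t statistic for the mean of the paired differences (pre - post)."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / sqrt(variance / n)


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented anxiety-stress scores and criterion scores for five clients.
pre = [20, 18, 15, 22, 17]
post = [16, 15, 14, 18, 15]
criterion = [8, 6, 3, 9, 5]

t_value = paired_t(pre, post)                       # df = n - 1 = 4
change = [a - b for a, b in zip(pre, post)]         # reduction in score
r_change_criterion = pearson_r(change, criterion)
```

The t value would then be referred to a table of t with n - 1 degrees of freedom, and the correlation shows how closely the reduction in score follows the success criterion in these made-up data.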

² The scores obtained by this population on the
traditional scales of the MMPI were given in a
previous article by the present writer [2].


The therapy criteria measures. A multiple
criterion for evaluating client-centered psychotherapy
with college students was developed
by Tucker [10] using the cases in the present
group. This multiple criterion consisted of four
measures:

1. A 29-item therapy rating scale scored by 
the therapist following the end of contact with 
the client. 

2. The same therapy rating scale scored by 
judges who read at least 60% of the tran- 
scribed interviews, including the first and last 
interviews. 

3. A 14-item client rating scale which the 
client filled out at the time he took the post- 
therapy test battery. 

4. The P/N ratio, which was the ratio of 
negative to positive feelings in the final inter- 
view as compared to the ratio of negative to 
positive feelings in the first interview. 

These four criterion measures were com- 
bined into a total criterion score by weighting 
each measure according to the ratio of its 
summed intercorrelation with the other three 
measures. All the rating scales were constructed 
especially for this study by members of the 
group research project. For further informa- 
tion concerning the reliability of judges’ rat- 
ings, the construction of rating scales, etc., the 
reader is referred to Tucker's original work [10].


Table 1

Differences between Pretherapy and Posttherapy
Anxiety-Stress Scores

Measure                          N     Mean      SD
Taylor Anxiety
    Pretherapy                  42    17.28     6.18
    Posttherapy                 42    13.76*    6.13
Winne Neuroticism
    Pretherapy                  41     9.85     4.28
    Posttherapy                 41     7.97*    3.82
Welsh Anxiety Index
    Pretherapy                  42    80.60    24.78
    Posttherapy                 42    70.40*   24.60
Welsh Internalization Ratio
    Pretherapy                  42   105.76    15.86
    Posttherapy                 42    98.41*   15.53

*Difference between pretherapy and posttherapy scores
significant at 1% level of confidence.


Results 


1. Table 1 shows that all the scales used in 
the present study reveal a difference in the pre- 
dicted direction between pretherapy and post- 
therapy scores that is significant at the 1% 
level of confidence. Of particular interest is 
the change in the Welsh Internalization Ra- 
tio in which the group shows a definite trend 
towards externalization following client-cen- 
tered therapy. 

2. Table 2 reveals the correlations between 
anxiety-stress change scores and the therapy- 
success criterion measures. As might be ex- 
pected, the various anxiety-stress measures 
show a moderate to high agreement among 
themselves. It is clear that they are measuring 
somewhat the same quality, but they could 
hardly be used interchangeably on the basis of 
the present findings. Of particular interest is 
the high correlation between the Taylor scale 
and Winne scale confirming previous findings. 
One might think that this could be accounted 
for on the basis of considerable item overlap 
in the two scales. However, a check revealed 
that only six of the 34 items on the Taylor 
scale also appear on the Winne scale. 


3. The correlations between anxiety stress 
changes and the therapy-success measures 
show considerable variation. The Taylor scale 
seems to agree with these measures most ade- 
quately, although the Winne scale shows a sig- 
nificant agreement with every measure but the 
client rating scale. 

4. The Welsh Anxiety Index and Internali- 
zation Ratio did not fare so well, showing rela- 
tively low agreement with many of the criteria 
measures. One is forced to conclude from this 
that the scales designed specifically to measure 
neuroticism and anxiety stress are better pre- 
dictors of therapy success than are combina- 
tions of previously used MMPI scales. 


Discussion 


The results of the present study seem to 
show rather clearly that there is a decrease in 
anxiety stress concomitant with client-centered 
therapy. As was stated previously, this result 
will have more meaning when placed beside 
other studies which show behavioral changes 
and perceptual differences as a result of thera- 
py. Also important would be the same type of 
comparison done by members of an opposing 
theoretical and therapeutic orientation. 








Table 2

Intercorrelations of Anxiety-Stress Changes
and Criteria for Therapy Success

                               Winne        Welsh    Welsh            Multiple   Therapist  Judges  Client  P/N
Measure                        Neuroticism  Anxiety  Internalization  Criterion  Rating     Rating  Rating  Ratio
                                            Index    Ratio
Taylor Anxiety                 .62          .49      .38              .54        .35        .47     .48     .4?
Winne Neuroticism                           .30      .58              .54        .51        .45     .21     .42
Welsh Anxiety Index                                  .71              .28        .08        .28     .42     .21
Welsh Internalization Ratio                                           .40        .13        .33     .16     .37

Note. For .05 level of confidence, r = .355; for .01 level of confidence, r = .456.


One may reasonably ask if the high correla- 
tions obtained between anxiety scales and 
therapy criterion measures were not due large- 
ly to the therapy measures depending upon ob- 
servation of this very stress reduction. On the 
client rating scale and the P/N ratio this 
might be of relatively high consequence; how- 
ever, on the therapist and judges’ rating scales 
the emphasis in judgments was placed on de- 
velopment of insight, planning behavior, un- 
derstanding of self, etc. Thus we can say with 
some confidence that these correlations are a 
result of concomitant rather than similar be- 
havior. 

Summary 


An attempt was made to see if there were 
anxiety-stress changes, as measured by various 
MMPI anxiety scales, from pretherapy to post- 
therapy, in 42 college students who underwent 
client-centered therapy. Comparisons were al- 
so made between the change in stress measures 
from the pretherapy test to the posttherapy 
test, and the various therapy-success criterion 
measures. 

It was found that, although all four meas- 
ures showed a significant decrease in stress 
from pretherapy to posttherapy, two of the 
measures, the Taylor Anxiety Scale and the 
Winne Neuroticism Scale, showed the highest 
amount of agreement with the therapy-success 
measures. 


Received March 30, 1953. 
The Taylor Anxiety Scale [26, 27] is a self- 
rating scale which has shown rather consistent 
relationships with specific independently de- 
fined behavior variables [5, 6, 7, 9, 10, 14, 17, 
18, 19, 20, 23, 24, 26, 28, 29, 30]. This scale 
(referred to hereafter as the A scale) prior to 
its most recent revision [27] consisted of 50 
MMPI items selected by judges as being in- 
dicative of manifest anxiety according to a de- 
finition taken from Cameron [3]. 

Even though scores on the A scale have been 
shown to possess considerable predictive value 
in a number of situations, it is possible that 
they may be influenced by various response sets 
that are not closely related to anxiety as orig- 
inally defined by Taylor. One such extraneous 
variable may be the desire of Ss to place them- 
selves in a socially favorable light. Thus Ss 
may consider not only whether a given state- 
ment is actually true of themselves, but also 
whether their answers will make them look 
well. 


Several findings suggest that A-scale scores may 
be influenced by the variable of social favorability. 
Among these are the positive skewness of the distri- 
bution of scores of the A-scale standardization 
group consisting of 1,971 introductory psychology 
students who took the A scale between 1948 and 
1950, and the correlation of -.74 between the A 
scale and the K scale of the MMPI based on the 
responses of 281 freshmen at the State University 
of Iowa. More direct support for the supposition 
that the expression of anxiety may be considered 


¹ Modification of a dissertation submitted to the 
faculty of the Department of Psychology of the 
State University of Iowa in partial fulfillment of 
the requirements for the Ph.D. degree. The writer 
is greatly indebted to Drs. H. P. Bechtoldt and I. 
E. Farber, under whose direction this investigation 
was completed. 


² Now at the Child Guidance Center, Des Moines, 
Iowa. 


socially undesirable has been found in an unpub-
lished preliminary investigation.³


One method of dealing with the response tenden-
cies associated with considerations of social favor-
ability is to estimate their influence on test scores
as is done in the K and L scales of the MMPI
[16]. Another method attempts to reduce the pos-
sible effects of differences in the social acceptability
of items on the test scores by use of the forced-
choice technique [8, 22, 25, 31, 32]. On the basis
of available evidence Cronbach [4] has concluded
that the forced-choice technique can be made rela-
tively free from the extraneous influence of re-
sponse sets, offering the possibility of increasing the
predictive value of tests in which it is used. The
technique involves the “grouping of alternative
choices to make them appear of equal value and
have unequal significance” [1, p. 424]. In a forced-
choice scale of anxiety Ss might be required to
choose between self-descriptive statements presumed
to be relevant to degree of anxiety and nonanxiety
statements having equal value with respect to social
favorability. The present investigation was con-
cerned with the construction and evaluation of such
a forced-choice scale of anxiety.


Procedure 


Construction of the forced-choice anxiety 
scale. Two sets of anxiety items were used in 
the forced-choice scale. One set, hereafter re- 
ferred to as FC-1, contained the 50 items con- 
stituting the A scale. The other set of 50 items, 
hereafter referred to as FC-2, contained 
MMPI items not originally selected as anxiety 
indicators that showed a correlation of .41 or 
more with the total A-scale score. The com- 
bined sets of 100 forced-choice items will be re- 
ferred to as FC-T. Items were defined as non- 
anxiety statements if they were rated by none 
or only one of Taylor’s five judges as descrip- 
tive of manifest anxiety and if they showed a 
correlation of .24 or less with the total A-scale 
score.⁴


³ Cook, W. C., Wood, E. C., McAllister, J., & 
Zide, N. The favorability ratings by anxious and 
nonanxious subjects of the items in the Iowa Bio- 
graphical Inventory. Unpublished manuscript, State 
Univer. of Iowa, 1949. 


In order to obtain favorability values for the 
anxiety statements and for the nonanxiety state- 
ments to be grouped with them, the 566 items in the 
group form of the MMPI and 34 items in the Wes- 
ley Rigidity Scale [30], a total of 600 statements, 
were rated on a five-point scale of social favor- 
ability by 108 students in introductory psychology. 
The Ss were instructed to assume that the answer to 
each of the items was “yes.” A rating of one repre- 
sented a judgment of very favorable and five was 
very unfavorable. The mean favorability ratings of 
the 24 anxiety items that were stated negatively 
(e.g., “I don’t usually worry”) ranged from 1.38 
to 2.44, while the mean favorability ratings of the 
76 anxiety items that were stated positively (e.g., 
“I am inclined to worry”) ranged from 2.72 to 
4.62.⁵ Thus the denial of the anxiety was consis- 
tently rated as more desirable than its admission. 

Among the factors to be considered in constructing 
and administering a forced-choice scale is the 
number of choices in a block. Stewart [25] has re- 
ported that resistance is encountered when Ss are 
required to choose from among uniformly unfavor- 
able self-descriptive statements. She suggested that 
this resistance may be overcome if both desirable 
and undesirable alternatives are included in a 
forced-choice block consisting of three or more 
choices, and if Ss are instructed to select a state- 
ment that is most descriptive of themselves and a 
statement that is least descriptive. Accordingly, 
each block in the present scale consisted of three 
statements: an anxiety statement, a nonanxiety 
statement matched with it for social favorability, 
and a second nonanxiety statement differing in social 
favorability from the two matched statements. None 
of the mean favorability ratings of the two matched 
statements differed by more than .10, while the sec- 
ond nonanxiety statement differed from the anxi- 
ety item by .80 to 1.25. If the matched statements 
were relatively unfavorable, the third statement 
was more favorable; if the matched statements 
were relatively favorable, the third statement was 
less favorable. Thus, within each block Ss were to 
choose among statements differing with respect to 
social acceptability and with respect to their desig- 
nation of anxiety by indicating the item most de- 
scriptive and least descriptive of themselves. 

The following triad is a typical example of a 
forced-choice block used in this investigation: 


⁴ The relevant correlational data were obtained 
from an unpublished study by H. P. Bechtoldt, State 
Univer. of Iowa. 

⁵ The 600 items that were rated for social favora- 
bility, the instructions to the raters, and the mean 
favorability ratings are presented in the original 
dissertation on file in the State Univer. of Iowa li- 
brary and in University Microfilms. 





Heineman 
                                        Mean
                                        favorability
Statement                               rating
A. I have strong political opinions.    2.77
B. I sometimes tease animals.           3.82
C. I am a high-strung person.           3.88


Item C is an anxiety statement matched for 
favorability with item B, a nonanxiety statement. 
Item A is a nonanxiety statement rated as more 
favorable than the two matched statements. The 
position of the statements within each block was 
varied by means of a table of random numbers. 
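
The matching rules for a block can be expressed as a small check. This is a minimal sketch, assuming the tolerances stated in the text: the matched pair differs by no more than .10 in mean favorability, and the third statement differs from the anxiety item by .80 to 1.25. The names `Statement` and `is_valid_block` are hypothetical, and the example uses the triad shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    text: str
    favorability: float  # mean rating: 1 = very favorable, 5 = very unfavorable
    is_anxiety: bool


def is_valid_block(anxiety, matched, third,
                   match_tol=0.10, gap_min=0.80, gap_max=1.25):
    """Check one triad against the favorability rules described in the text."""
    eps = 1e-9  # guard against floating-point rounding at the boundaries
    if not anxiety.is_anxiety or matched.is_anxiety or third.is_anxiety:
        return False
    matched_ok = abs(anxiety.favorability - matched.favorability) <= match_tol + eps
    gap = abs(anxiety.favorability - third.favorability)
    return matched_ok and (gap_min - eps <= gap <= gap_max + eps)


a = Statement("I have strong political opinions.", 2.77, False)
b = Statement("I sometimes tease animals.", 3.82, False)
c = Statement("I am a high-strung person.", 3.88, True)

print(is_valid_block(anxiety=c, matched=b, third=a))
```

The check covers only the favorability criteria; the additional screening by frequency of selection and by item content would have to be applied separately.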


Statements were matched for favorability 
only if their frequency of selection by Ss as 
self-descriptive was approximately equal.⁶ 
Several statements bearing on the same behav- 
ior (e.g., “I am a good mixer” and “I love to 
go to dances”) and mutually exclusive state- 
ments (e.g., “I almost never dream” and “I 
dream frequently”) were not used in the same 
forced-choice block. In addition, items that ap- 
peared to be predominantly statements of fact 
rather than reflections of attitudes or judg- 
ments (e.g., “In school I was sometimes sent 
to the principal for cutting up”) were not in- 
cluded in the forced-choice scale. Some state- 
ments (e.g., “I am a special agent of God”) 
were excluded because it was believed that col- 
lege students might not take them seriously.⁷ 

Scoring procedure. Two different scoring 
procedures were used. In one scoring method 
(Key 1) each of the three statements in a block 
was considered. Marking a positively worded 


⁶ The probability of selection has been indicated in 
several studies showing the probability of “yes” 
responses to MMPI items and to items in the Wes- 
ley Rigidity Scale. Personal communications from 
Dr. L. E. Drake, University of Wisconsin, and from 
Drs. W. G. Dahlstrom and H. P. Bechtoldt, State 
University of Iowa. 

⁷ The forced-choice scales and instructions are pre- 
sented in the original dissertation on file in the 
State Univer. of Iowa library and in University 
Microfilms. 


The MMPI group form numbers of the anxiety 
statements in FC-1: 7, 13, 14, 18, 23, 31, 32, 43, 67, 
86, 107, 125, 142, 158, 163, 186, 190, 191, 217, 230, 
238, 241, 242, 263, 264, 287, 301, 317, 321, 322, 335, 
337, 340, 352, 361, 371, 397, 407, 418, 424, 431, 439, 
442, 499, 506, 523, 528, 530, 549, 555. 

The MMPI group form numbers of the anxiety 
statements in FC-2: 3, 8, 51, 54, 57, 79, 94, 97, 102, 
122, 138, 146, 147, 148, 171, 172, 189, 201, 226, 234, 
236, 266, 267, 279, 284, 299, 304, 305, 309, 336, 344, 
356, 359, 374, 379, 382, 384, 389, 401, 402, 404, 411, 
414, 416, 421, 425, 479, 509, 541, 560. 
anxiety statement as most self-descriptive was 
scored 1, and marking it as least self-descrip- 
tive was scored 0. When the two nonanxiety 
statements were marked and the anxiety state- 
ment was left unmarked, the anxiety statement 
was ranked between the two nonanxiety state- 
ments as being neither most nor least descrip- 
tive, so that the anxiety score was .5. When 
the anxiety statement was stated negatively, 
marking it as least self-descriptive was scored 
1, and marking it as most self-descriptive was 
scored 0; leaving the anxiety statement un- 
marked was scored .5. By assigning scores of 
0, .5, and 1, the possible range of scores was 
50 for both FC-1 and FC-2, so that the Key 1 
scores covered the same range as the scores on 
the original A scale. 


In the second scoring method (Key 2) all 
items scored 0 by the first method remained 0, 
and those scored 1 remained 1. When a posi- 
tively worded anxiety item was left unmarked, 
a score of 1 was assigned if the matched non- 
anxiety statement was marked as least descrip- 
tive, and a 0 score was given if the matched 
nonanxiety statement was marked as most de- 
scriptive. When a negatively worded anxiety 
item was left unmarked, an answer of least 
descriptive for the matched nonanxiety state- 
ment was scored 0, and a marking of most de- 
scriptive was scored 1. The possible range of 
Key 2 scores was also 50 for both FC-1 and 
FC-2. Both scoring keys provided three anxiety 
scores, one for FC-1, another for FC-2, and a 
third for FC-T, the sum of FC-1 and FC-2. 
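
The two scoring keys can be written out as short functions. This is a minimal sketch of the rules as described above; the labels `"anxiety"`, `"matched"`, and `"third"` and the function names are hypothetical conveniences, not part of the original scale materials.

```python
LABELS = {"anxiety", "matched", "third"}


def _check(most, least):
    # Each block receives one "most descriptive" and one "least descriptive" mark.
    if most not in LABELS or least not in LABELS or most == least:
        raise ValueError("most and least must be two different statements")


def key1_score(most, least, positively_worded=True):
    """Key 1: an unmarked anxiety statement ranks in the middle and scores .5."""
    _check(most, least)
    if most == "anxiety":
        raw = 1.0
    elif least == "anxiety":
        raw = 0.0
    else:
        raw = 0.5
    # For negatively worded anxiety items the direction is reversed.
    return raw if positively_worded else 1.0 - raw


def key2_score(most, least, positively_worded=True):
    """Key 2: an unmarked anxiety statement is resolved by the matched statement."""
    _check(most, least)
    if most == "anxiety":
        raw = 1
    elif least == "anxiety":
        raw = 0
    elif least == "matched":
        raw = 1  # matched statement rejected, so the anxiety item outranks it
    else:
        raw = 0  # matched statement chosen as most descriptive
    return raw if positively_worded else 1 - raw


def total_score(responses, scorer):
    """Sum block scores; responses are (most, least, positively_worded) tuples."""
    return sum(scorer(m, l, p) for m, l, p in responses)
```

For example, `total_score([("third", "matched", True), ("anxiety", "third", False)], key2_score)` scores the first block 1 and the second block 0. Key 2 ignores the second nonanxiety statement whenever the anxiety statement is unmarked, which is the only point at which the two keys diverge.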

Subjects. The forced-choice scale was ad- 
ministered to 209 Ss in introductory psychol- 
ogy who had not made the favorability ratings. 
Each of these 209 Ss had taken the Iowa Bio- 
graphical Inventory, which included the A 
scale, about ten weeks prior to the adminis- 
tration of the forced-choice scale. 


Results and Discussion 


Relation between the forced-choice scale and 
the A scale. It was reasoned that if the corre- 
lations between the forced-choice form of the 
anxiety scale and the A scale did not differ 
significantly from the test-retest correlation ob- 
tained for the A scale, it might be concluded 
either that no extraneous source of variance 
due to the favorability factor is present in the 
A scale, or that the forced-choice form has not 
led to its elimination. Table 1 presents the 
correlations between the A scale and the Key 
1 and Key 2 scores of FC-1, FC-2, and their 
total, FC-T. 


Table 1

Correlations between the Taylor A Scale and Two
Sets of Forced-Choice Items Treated Separately
(FC-1 and FC-2) and in Combination
(FC-T) on the Basis of Two Scoring
Keys (N = 209)

Correlated Scales     Key 1          Key 2
A, FC-1               [illegible]    .60
A, FC-2               .68            .58
A, FC-T               .74            .62


The correlation between the A scale and 
the Key 1 score of FC-1, which consisted of 
the A-scale items in forced-choice form, did 
not differ significantly from the previously re- 
ported [7] A-scale test-retest correlation of 
.83 (t = 1.7[?], df = 268). However, the cor- 
relation between the A scale and the Key 2 
score of FC-1 was significantly lower (p < 
.01) than the A-scale test-retest correlation 
(t = 3.41, df = 268). Furthermore, the cor- 
relation between the A scale and the Key 2 
scoring of FC-1 was significantly lower (p = 
.001, t = 4.12, df = 206) than the correlation 
between the A scale and the Key 1 scoring of 
FC-1. 

In the light of other data to be presented it 
seems reasonable to interpret these results as 
indicating that the effects of social desirability 
may have been reduced or eliminated entirely 
as a source of variance common to the A scale 
and to the forced-choice scale only when Key 2 
was applied to the responses in the forced- 
choice scale. 


Relations between forced-choice subscales 
and scoring methods. It will be recalled that 
the anxiety items used in FC-2 were selected 
on the basis of their high correlation with the 
A scale when presented in single-choice form. 
A cross validation of these relationships could 
be obtained by correlating the FC-1 and FC-2 
scores in the present study. 

The correlation between FC-1 and FC-2, as 
scored by Key 1, was found to be .79. A correla- 
tion of practically equal magnitude (.74) was found 
when these two sets of items were scored by Key 2. 
In addition, the correlations between each of these 
two sets of items in forced-choice form and the 
A scale, as shown in Table 1, did not differ signifi- 
cantly, either for Key 1 scores (t = 1.71, df = 206) 
or for Key 2 scores (t = .54, df = 206). 

These results may be taken as evidence that a 
set of items selected on the basis of high correla- 
tion with the A scale in the single-choice form 
were also correlated highly in the forced-choice form. 
Thus, scores on the forced-choice form appeared to 
be a function of factors similar to those influencing 
scores on the single-choice form of the test. Since 
both FC-1 and FC-2 could therefore be regarded as 
measures of manifest anxiety, these two sets of 50 
forced-choice blocks were combined into a single 
scale of 100 blocks (FC-T). 


Reliability. In order to evaluate the relia- 
bility of the forced-choice scales the internal 
consistency estimates of reliability were ob- 
tained by the use of the Kuder-Richardson for- 
mula 21 [13] and are reported in Table 2.⁸ 


Table 2

Reliability Coefficients of Three Sets of Forced-
Choice Items Based on Two Scoring Keys
(N = 209)

Set of Items     Key 1*    Key 2†
FC-1             .82       .69
FC-2             .80       .70
FC-T             .89       .83

* Kaitz modification.
† Kuder-Richardson formula No. 21.


The Kuder-Richardson estimate of reliability of 
the A scale, as computed for the sample of 209 Ss 
who also took the forced-choice scale, was .85. The 
estimated reliabilities of the Key 1 scores of FC-1 
and FC-2 were not significantly lower than .85. The 
significant differences between the reliability co- 
efficients involved the comparisons between the A 
scale and Key 2, and between the two scoring keys. 
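
The Kuder-Richardson formula 21 estimate needs only the number of items and the mean and standard deviation of the total scores. The following is a minimal sketch, assuming the usual textbook form r = [k/(k - 1)] [1 - M(k - M)/(k s²)]. The function name `kr21` is hypothetical, and the inputs are the means and standard deviations reported in Table 3. The article does not show its own computation, so this is an illustration rather than a reproduction.

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson formula 21 for items scored 0 or 1."""
    k = n_items
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))


# Means and SDs as reported in Table 3 (N = 209).
a_scale = kr21(50, 13.68, 7.66)      # A scale, 50 items
fc1_key2 = kr21(50, 25.58, 6.22)     # FC-1 scored by Key 2
fct_key2 = kr21(100, 52.49, 11.82)   # FC-T scored by Key 2, 100 blocks

print(round(a_scale, 2), round(fc1_key2, 2), round(fct_key2, 2))
```

With these inputs the estimates come out near .85, .69, and .83, in line with the values reported for the A scale and for the Key 2 scores. The formula assumes 0/1 item scoring, so it is not applied here to Key 1 scores, which use three scoring weights.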


⁸ In comparing the reliability of the A scale with 
that of the forced-choice scale the number of scor- 
ing categories for each item was considered. Both 
the A scale and the forced-choice scale, when scored 
by Key 2, allotted scores of zero or unity to the 
responses to a single item, so that the Kuder- 
Richardson estimate of reliability could be com- 
puted on the basis of 50 items. The scoring of the 
forced-choice scale by Key 1, however, involved 
three response categories which were assigned scor- 
ing weights of 0, .5, and 1. Accordingly, a modifi- 
cation of the Kuder-Richardson estimate of relia- 
bility developed by Kaitz [12] was utilized as 
more appropriate for Key 1 scores than the Kuder- 
Richardson formulas. 
The A scale was significantly more reliable (p < 
.001) than the Key 2 scores of FC-1 (t = 5.14) 
and of FC-2 (t = 5.08). The Key 1 forced-choice 
scale scores were more reliable (p = .001) than 
the Key 2 scores for FC-1 (t = 7.31), for FC-2 
(t = 8.56), and for FC-T (t = 4.25). Thus the 
Key 2 scoring of the forced-choice scale yielded the 
least reliable measures. 

The lower reliability of Key 2 as contrasted to 
the A scale might be accounted for in terms of a 
reduction in social favorability as a factor common 
to the responses to the forced-choice form. The lower 
reliability of the Key 2 scores as compared to the 
Key 1 scores might be due to two factors: a de- 
creased influence of social favorability on the scores 
obtained by Key 2, and/or the reduction in the 
number of scoring categories from three in Key 1 
to two in Key 2, a reduction which would probably 
decrease the discrimination among subjects. 


Table 3

Mean, Median, Standard Deviation, and Range of
the A Scale and of FC-1, FC-2, and FC-T,
Based on Two Scoring Keys
(N = 209)

Scale          Mean     Median    SD       Range
A scale        13.68    13        7.66     1-34
FC, Key 1
    FC-1       22.02    22.5      6.32     7.5-37.0
    FC-2       23.06    23.5      6.14     8.0-39.5
    FC-T       45.08    45.5      11.67    16.5-76.0
FC, Key 2
    FC-1       25.58    26        6.22     9-42
    FC-2       26.91    27        6.39     9-42
    FC-T       52.49    54        11.82    12-78





Distributions of scores. Table 3 shows the 
means, medians, standard deviations, and 
ranges of the scores for the A scale and for the 
forced-choice scale. The distribution of the A- 
scale scores was shown to be positively skewed 
by a test of normality given by McNemar 
[15], the deviation from normality being signi- 
ficant at the .02 level of confidence (t = 2.35). 
The distribution of FC-T scores was normal 
for Key 1, and negatively skewed at the .05 
level of confidence (t = 2.03) for Key 2. 
Since the standard deviations of the A scale 
and of the Key 1 and Key 2 scores of the 
forced-choice scale were approximately the 
same, the elimination of the positive skewness 
characteristic of the A scale might be taken as 
an indication of the decreased influence of the 
social favorability factor on the forced-choice 
scale scores. 
Table 4

Correlation between the MMPI K Scale and Two
Sets of Forced-Choice Items Treated Separately
(FC-1 and FC-2) and in Combination
(FC-T) on the Basis of Two
Scoring Keys (N = 209)

Correlated Scales    Key 1    Key 2
K, FC-1              -.55     -.36
K, FC-2              -.57     -.41
K, FC-T              -.57     -.42





Relations with the K scale. As the K scale 
of the MMPI was developed to serve as an 
estimate of the influence of social favorability 
on test scores, the magnitude of the relation- 
ship to the K scale of scores obtained on the A 
scale and on the forced-choice scale might also 
be used as a measure of sensitivity to the social 
favorability variable. In Table 4 is shown the 
relationship of the K scale with the A scale and 
with the forced-choice scale. 


The negative correlations between the K scale and 
the Key 1 and Key 2 scores of FC-T were signifi- 
cantly lower (p < .001) than the correlation of 
-.74 between the A scale and the K scale (t = 
5.04 for Key 1, 7.85 for Key 2, df = 206).⁹ Thus 
the use of the forced-choice scale scored by either 
key resulted in a marked reduction in the influence 
of the social acceptability as measured by the cor- 
relation of the test scores with the K scale. This 
evidence of a reduction in the influence of social 
favorability was also found when FC-1 and FC-2 
were considered separately. 

The negative correlation between the Key 2 
score of FC-T and the K scale was significantly 
lower (p < .001) than the correlation between the 
Key 1 score of FC-T and the K scale (t = 5.90, 
df = 206). Thus the Key 2 scores appeared to be 
significantly less susceptible to the influence of social 
desirability than the Key 1 scores. 

The correlation between the Key 2 score of the 
forced-choice scale and the K scale still differed 
significantly from 0 (p < .001), which may be ac- 
counted for, in part, by the overlap in the forced- 
choice scale and in the K scale of 11 statements that 
contribute to the anxiety score when answered in 
one way and to the K-scale score when answered 
in the other. 


Bias experiment. The most direct method 


⁹ The correlation between the A scale and the K 
scale was based on the responses of Ss who took the 
A scale about ten weeks prior to the administration 
of the K scale. The K scale was part of a battery 
of tests that was administered jointly with the 
forced-choice scale. It is not believed that the time 
interval affected the correlation between the A 
scale and the K scale to any marked degree. 
of evaluating the influence of a social desir- 
ability factor upon the various anxiety measures 
is to instruct Ss to answer deliberately in such 
a way as to appear in the most favorable light 
possible. A number of studies [2, 11, 21] have 
shown that scores on personality tests of the 
“yes-no” type are greatly influenced by such 
instructions. On the basis of the reasoning un- 
derlying the construction of the forced-choice 
scale, it would be predicted that it would not 
be so greatly influenced by biasing instructions. 

Sixty-four Ss from introductory psychology, 
who had not previously made favorability rat- 
ings or taken the forced-choice scale, were di- 
vided into two groups matched on the basis of 
their A-scale scores. About two weeks fol- 
lowing the administration of the A scale 
one group of 32 Ss took the A scale under in- 
structions to appear in the most favorable 
light possible, and the other group of 32 Ss 
took the forced-choice scale under similar in- 
structions. In order to evaluate the influence 
of biasing instructions it was necessary to ex- 
press the raw scores in comparable units. Ac- 
cordingly the forced-choice scale scores and A- 
scale scores were transformed linearly into a 
distribution having a mean of 20 and a stand- 
ard deviation of 4. The coded means, medians, 
and standard deviations obtained by the 
matched groups of 32 Ss under conventional 
and biasing instructions are shown in Table 5. 
The following discussion is based on the coded 
scores. 
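
The linear coding described here maps each raw score onto a scale with a mean of 20 and a standard deviation of 4. The following is a minimal sketch, assuming that the reference mean and standard deviation for each measure were those of the standard sample of 209 Ss given in Table 3; the text does not state this explicitly, and the function name `to_coded` is hypothetical.

```python
def to_coded(raw, ref_mean, ref_sd, target_mean=20.0, target_sd=4.0):
    """Linearly rescale a raw score relative to a reference distribution."""
    return target_mean + target_sd * (raw - ref_mean) / ref_sd


# Assumed reference values: FC-T scored by Key 2 in the sample of 209 Ss.
REF_MEAN, REF_SD = 52.49, 11.82

at_reference_mean = to_coded(52.49, REF_MEAN, REF_SD)
one_sd_above = to_coded(52.49 + 11.82, REF_MEAN, REF_SD)
faked_mean = to_coded(53.22, REF_MEAN, REF_SD)

print(at_reference_mean, one_sd_above, round(faked_mean, 2))
```

Under this assumption the uncoded faked Key 2 mean of 53.22 reported later in the text maps to about 20.25, close to the coded mean of 20.26 in Table 5. The small discrepancy is a reminder that the reference values used here are only an inference.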


The mean of the faked A-scale scores was sig- 
nificantly lower, at the .001 level, than the mean 
of the conventional A-scale scores (t = 6.31, df = 
31). The mean biased score of FC-T, Key 1, was 
also significantly lower, at the .01 level, than the 
mean conventional A-scale score (t = 2.80, df = 
31). On the other hand, the mean biased score of 
FC-T, Key 2, did not differ significantly either 
from the mean of the conventional A-scale scores 
(t = 1.78, df = 31) or from the mean conventional 
Key 2 forced-choice score of 20.00 of the standard 
sample of 209 Ss. These results demonstrate that 
the A scale and the Key 1 scoring of the forced- 
choice scale are subject to the influence of the social 
favorability variable, while the Key 2 scores appear 
to be relatively insensitive to defensive attitudes. 

Furthermore, the mean biased A-scale score was 
significantly lower (p < .01) than the mean biased 
score of FC-T, Key 1 (t = 3.60, df = 31). In 
turn, the mean bias score of FC-T, Key 1, was 
significantly lower (p < .001) than the mean bias 
score of FC-T, Key 2 (t = 5.53, df = 31). It is 
apparent that the forced-choice scale, scored by 
either key, was less sensitive to deliberate faking 
than the A scale. Nevertheless, forced-choice scale 
scores were more susceptible to faking when the 
second nonanxiety statement was considered in the 
Key 1 scoring. 


Table 5

Transformed Means, Medians, and Standard Deviations of Conventional and
Biased A-Scale Scores, and of Biased Forced-Choice Scale Scores (FC-T)
Scored by Two Keys for Two Groups of 32 Matched Ss

            A-scale group          Forced-choice scale group
            A-scale   A-scale      A-scale   FC-T, Key 1   FC-T, Key 2
            conv.     bias         conv.     bias          bias
Mean        19.13     15.03        19.03     17.18         20.26
Median      18.08     13.90        17.82     17.53         20.02
SD          3.29      2.49         3.74      1.92          2.57


Additional considerations. The instructions 
used in the bias experiment were equivalent to 
instructions to rate items for favorability. 
Scoring by Key 2, under biasing instructions, 
provided a means of estimating whether state- 
ments of equal social favorability when rated 
singly were rated as equally desirable after 
they were grouped into forced-choice blocks. If 
the matched anxiety and nonanxiety statements 
were regarded as equally favorable, one might 
expect that each of the two statements would 
have an equal chance of being selected as the 
more favorable, so that on the average the 
mean of the faked Key 2 scores of the forced- 
choice scale of 100 blocks would be 50. The 
mean faked Key 2 score of FC-T, not coded, 
was 53.22, which differed from a hypothetical 
mean of 50 between the .02 and .05 levels of 
confidence (t = 2.36, df = 31). Thus the 
anxiety statements were selected somewhat 
more frequently as being desirable than their 
matched nonanxiety statements. This circum- 
stance may account for the slight negative 
skewness of the distribution of the Key 2 scores. 
It may be concluded that the procedure of 
matching statements on the basis of their indi- 
vidual favorability ratings did not insure ex- 
actly equal favorability when the matched 
statements were put into forced-choice form. 


The effectiveness of a forced-choice scale of anxi- 
ety in reducing or eliminating social favorability 
as a variable affecting test scores may also be lim- 
ited if anxious and nonanxious individuals dif- 
fer in their estimate of the social favorability of 
anxiety statements. In order to test this possibility 
24 anxious and 19 nonanxious Ss were selected 
from the 108 Ss who had rated the MMPI and the 
Wesley Rigidity Scale items for social favorability. 
Individuals who scored in the upper 20% of the A 
scale were defined as anxious, and those who 
scored in the lower 20% were defined as non- 
anxious. The favorability ratings of these two 
groups of Ss were compared with respect to the 
50 anxiety items from the A scale and their 
matched nonanxiety statements. The 37 positively 
stated anxiety items in FC-1 were considered 
separately from the 13 negatively stated items.¹⁰ 

The difference scores between the favorability 
ratings of the anxious and nonanxious Ss were 
computed for each of the 37 positively stated anxi- 
ety items and for each of their matched nonanxiety 
items. The anxious Ss rated these 37 anxiety state- 
ments as significantly more favorable, at the .001 
level, than did the nonanxious Ss (t = 4.70, df = 
72). Applying the same procedure to the 13 nega- 
tively stated anxiety items and to their matched 
nonanxiety items, it was found that anxious Ss 
rated the anxiety items as significantly less favor- 
able (p < .001) than did the nonanxious Ss (t = 
5.97, df = 24). These results indicate that anxious 
Ss regard the expression of anxiety less unfavorably 
than do nonanxious Ss. 


Since anxious Ss differed from nonanxious 
Ss in their favorability ratings of anxiety 
statements, the influence of social favorability 
was probably not entirely eliminated from the 
test scores regardless of the form of presenta- 
tion and of the scoring method. However, the 
results of this investigation indicate that the 
influence of social desirability on test scores 
could at least be drastically reduced by the use 
of the forced-choice method. 


¹⁰ Favorable ratings of a negatively worded item had the same meaning as
unfavorable ratings of a positively worded item. For instance, when Ss
gave a relatively favorable rating to a negatively worded item, such as
“I don’t usually worry,” they considered the behavior of worrying as
undesirable.







Summary 


The present study was concerned with the development of a forced-choice
form of the Taylor Anxiety Scale designed to reduce the effects of
possible tendencies by Ss to consider the social desirability of
particular responses. Each block of the forced-choice form consisted of
three statements: an anxiety statement and a nonanxiety statement of
comparable social favorability, and a second nonanxiety statement
differing in social favorability from the two matched statements.
Favorability ratings were obtained for 566 MMPI items and for 34 items
from the Wesley Rigidity Scale. In 50 forced-choice blocks (FC-1) the
items constituting the original A scale were used as anxiety statements,
and in 50 forced-choice blocks (FC-2) items which exhibited a high
correlation with total scores on the original A scale were used as
anxiety statements. The Ss were required to choose one statement in each
triad as most descriptive and one as least descriptive of themselves. By
use of Key 1, scores were assigned according to the ranking of the
anxiety statement as most or least descriptive relative to the two
nonanxiety statements, while in Key 2 the ranking of the anxiety
statement relative only to its matched nonanxiety statement was scored.
This forced-choice scale of anxiety was administered to 209 Ss who had
taken the A scale about ten weeks previously.
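The Key 2 rule described above can be sketched in a few lines. This is an illustration only: the one-point-per-block weighting is an assumption, chosen because it agrees with the chance expectation of 50 on 100 blocks cited in the text, and the names `TriadResponse`, `key2_point`, and `key2_score` are hypothetical. The weights used for Key 1 are not stated in this section, so Key 1 is not sketched.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Labels for the three statements in a block (hypothetical names).
STATEMENTS = ("anxiety", "matched", "other")


@dataclass(frozen=True)
class TriadResponse:
    most: str   # statement chosen as most descriptive
    least: str  # statement chosen as least descriptive


def rank_order(resp: TriadResponse) -> List[str]:
    """Return the three statements ordered from most to least descriptive."""
    if resp.most not in STATEMENTS or resp.least not in STATEMENTS:
        raise ValueError("unknown statement label")
    if resp.most == resp.least:
        raise ValueError("most and least must be different statements")
    middle = next(s for s in STATEMENTS if s not in (resp.most, resp.least))
    return [resp.most, middle, resp.least]


def key2_point(resp: TriadResponse) -> int:
    """Assumed Key 2 rule: 1 if the anxiety statement outranks its matched
    nonanxiety statement, else 0. The third statement is ignored."""
    order = rank_order(resp)
    return 1 if order.index("anxiety") < order.index("matched") else 0


def key2_score(responses: Iterable[TriadResponse]) -> int:
    """Sum the assumed Key 2 points over all blocks."""
    return sum(key2_point(r) for r in responses)
```

Under this assumed rule, an S who is indifferent between the two matched statements earns a point on about half the blocks, which is the basis for the hypothetical mean of 50.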


The following results were obtained: 


1. The correlation between FC-1 and the A scale was significantly lower
than the A-scale test-retest correlation when Key 2 was used, but not
when Key 1 was used.


2. Items selected on the basis of their high correlation with the A
scale in the single-choice form were also correlated highly in the
forced-choice form. The scores obtained on FC-1 and FC-2 could therefore
be combined into FC-T.


3. The A scale and the Key 1 score of the forced-choice scale were more
reliable in terms of internal consistency than the Key 2 score of the
forced-choice scale.


4. The positive skewness of the distribution of the A-scale scores was
corrected when the forced-choice scale was used.

5. The negative correlation with the K scale of the MMPI was highest for
the A scale and lowest for the Key 2 score of the forced-choice scale.
Both differed significantly from the Key 1 score of the forced-choice
scale.

6. One group of 32 Ss took the A scale under instructions to appear as
favorable as possible, and another group of 32 Ss, matched with the
first group on the basis of their A-scale scores, took the forced-choice
scale under the same biasing instructions. While both the A scale and
the Key 1 scores of the forced-choice scale were lowered by deliberate
faking, this reduction was greater for the A scale. The Key 2 scores of
the forced-choice scale were not affected by biasing instructions.

7. Statements matched on the basis of social 
favorability were not of equal value after they 
were put into forced-choice form. 


Anxious subjects differed from nonanxious 
subjects in their estimate of the social desir- 
ability of anxiety statements. This was taken 
as evidence that the influence of social favora- 
bility may not have been entirely eliminated 
by the use of the forced-choice technique. How- 
ever, the results of this investigation indicated 
that the influence of social desirability on test 
scores could at least be drastically reduced. 


Received March 23, 1952. 
In an earlier study, Guertin [3, 4] found
some factor analytic evidence for three types 
of paranoid schizophrenics, but the general do- 
main of schizophrenia investigated was too 
broad to bring the smaller area of paranoid 
schizophrenia into clear focus. The three tenta- 
tive factors suggested were the persecuted-sus- 
picious, the superior-grandiose, and the over- 
ideational. In a transposed factor analysis of 
a portion of these data [5] a large first factor 
was obtained which could be identified as para- 
noid schizophrenia but again the sample was 
too broad to provide a focus solely on paranoid 
schizophrenia. 


The present report is on a propaedeutic 
study designed to explore the possibility of 
further subtyping paranoid schizophrenics by 
sampling the behavior of only paranoid schizo- 
phrenics and subjecting the results to a trans- 
posed factor analysis. 


Method 


The study is based upon 24 hospitalized males who were diagnosed as
paranoid schizophrenics by the hospital staff. In all cases the
investigator looked for intrinsic or extrinsic evidence of paranoid
psychosis at the time of testing in order to avoid including any
subjects who might be in complete remission. The twenty-fifth subject is
a hypothetical “normal” individual derived from normative data in the
MMPI manual.


It is interesting that an earlier study independently employed this
“normal” individual in a similar fashion [1]. It was necessary to
exclude over one-half of the original paranoid sample selected for this
study for various reasons. Among the original 51 subjects, only two had
to be discarded because they were uncooperative and hostile. This would
suggest that there was an adequate sampling of hostile-reacting
individuals. Five were excluded because they appeared to be in good
remission while four had to be discarded because of foreign-language
background which made communication difficult. Two more individuals were
discarded because their distribution of true and false answers left too
small a proportion in one of the tails. It was felt that no serious harm
was done to the representativeness of the sample because of these
exclusions. However, it was necessary to exclude 13 individuals who were
too confused as demonstrated by their clinical behavior or their
unreliability on the test. This suggests that the only really important
biasing of the final sample was in the direction of the exclusion of
paranoids with appreciable confusion.


¹ This work was conducted at Beatty Memorial Hospital, and the report
completed at and released from Veterans Administration Hospital,
Knoxville, Iowa.

It was necessary to employ some technique of 
evaluating the individuals on a number of test 
items in order to make a transposed analysis. The 
choice lay among three areas: enumeration of symp- 
toms, checklist of behavior, and avowed attitudes 
and interests. Since the MMPI was designed to 
include sampling of significantly deviant paranoid 
behavior, it was felt that these cards might provide 
a group of items which would prove useful for 
sampling the avowed attitudes and interests of the 
paranoid schizophrenics. The authors independently, 
and later jointly, went through the total 550 MMPI 
cards and selected those items which they agreed 
probably would be most discriminating among dif- 
ferent types of paranoid schizophrenics. In general, 
there was good agreement on these selections and 
both investigators concurred independently on the 
final 100 items employed. An additional fifteen 
items from the L scale of the MMPI were included 
to provide some statistical control over the distortion 
of answers. 


The subjects were told that this was a research study and that their
assistance would be appreciated. They were assured that the information
obtained was not for their hospital folders but would be used for
scientific purposes. Each subject was
asked to sort the 115 cards into the true and false 
boxes according to whether he felt that the state- 
ment was true for him or not. All subjects, with the 
exception of two, complied very well with this and 
it was unnecessary to do much forcing of items into 
one box or another. After the 115 items had been 
sorted, each subject was asked to re-sort the 15 
items from the L scale in order to provide a simple 
reliability check. With these controls it was possible 
to say that the results were reliable and that some 
external criterion was available to indicate the va- 
lidity of the sorting of the 100 test items. The mean 
number of L items answered in the proper direction 
was 10.48, thereby indicating relatively little dis- 
tortion operating for this sample. The mean number 
of agreements upon retest with the L items of 13.24 
is also high. 


Tetrachoric intercorrelations among individ- 
uals were obtained from fourfold contingency 
tables and charts. The resulting intercorrela- 
tion table was factored by the multiple-group 
centroid method. Communalities were esti- 
mated by employing the largest correlation 
within a cluster for those individuals within
a cluster. Those who lay outside of clusters 
had communality estimates based upon the 
highest intercorrelation in the whole column. 
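
For readers who wish to reproduce this step without computing charts, the sketch below uses the cosine-pi approximation to the tetrachoric coefficient. This is a stand-in chosen for illustration, not the chart procedure used in the study. The function name and the cell labels are assumptions: a and d are the counts of items on which two subjects agree (both true, both false), and b and c are the counts on which they disagree.

```python
import math


def tetrachoric_cos_pi(a: int, b: int, c: int, d: int) -> float:
    """Cosine-pi approximation to the tetrachoric correlation for a
    fourfold table with agreement cells a, d and disagreement cells b, c.

    r = cos(pi / (1 + sqrt(a*d / (b*c))))
    """
    if min(a, b, c, d) < 0:
        raise ValueError("cell counts must be non-negative")
    agree, disagree = a * d, b * c
    if agree == 0 and disagree == 0:
        raise ValueError("coefficient is undefined for this table")
    if disagree == 0:
        return 1.0   # limiting value when one disagreement cell is empty
    if agree == 0:
        return -1.0  # limiting value when one agreement cell is empty
    return math.cos(math.pi / (1.0 + math.sqrt(agree / disagree)))


# Two subjects sorting 100 items: 40 both true, 40 both false, 20 split.
r = tetrachoric_cos_pi(40, 10, 10, 40)
```

The approximation is closest to the true tetrachoric value when both subjects split their answers near 50-50, which is one reason subjects with very lopsided true-false distributions are awkward for this kind of analysis.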


Results 


The intercorrelation matrix, which will not 
be presented here in order to conserve space, 
disclosed a great many medium high as well as 
high correlations with positive signs. Negative 
signs were largely absent. In doing a cluster 
analysis, it became obvious that one very large 
group factor was predominant. After extract- 
ing two group factors simultaneously, it was 
decided that a complete centroid check should 
be made to see whether the first of these ob- 
lique factors had pulled out sufficient variance. 
A complete centroid through the reduced in- 
tercorrelation matrix extracted a factor which 
accounted for 55% of the estimated com- 
munality. In contrast to this, the first group 
centroid factor had accounted for 45% of the 
estimated communality. It was felt that this 
group factor would be more meaningful and 
accounted for sufficient communality to be re- 
tained. Further cluster analysis was made on 
the residual matrix obtained after the first two 
oblique factors were extracted. After four
identifiable clusters had been located, the final
factoring was begun. Multiple-group centroid
extraction of these four factors simultaneously
produced the oblique factor matrix reported in
Table 1. Inasmuch as no further clusters of
individuals could be determined and since 85%

Table 1


Factor Matrices for Subjects


Subj.   Oblique factors*      Orthogonal representation
no.     A   B   C   D         I   II  III IV
17 88 | a ee 88 00 05 £00 
25 S4 40 78 46 34+ 06 33-06 
12 os. % 37 83 39 21 13 
y 80 22 49 45 80 -12 03 -04 
21 73 32 56 32 73 02 14 -15 
6 68 39 27 53 68 12 ~-22 19 
15 53 03849 45 2 “i es 
+ 53 21 32 44 53-01 00 16 
3 51 32 #61506 =| 645 51 12 -24 17 
2 30 80 39 41 30 74 08 36 
13 2¢ 75 47 -04 26 71 22 -18 
23 6 71 11 01 36 62 -30 -17 
14 29 58 53 37 29 51 33 28 
16 3 Ff ss 42 @© @ 
1 05 34 30 06 05 35 -09 05 
8 64 56 3 2&4 64 33 45 -19 
24 27 is Fw i 27 O8 80 -11 
5 72 56 72 48 2 So 
11 46 62 69 S51 46 47 41 32 
22 7 23 © 2 57 -13 43 -19 
20 Bb hit & Ss © 3 
7 “6 2 Ss 72 43 -06 13 55 
18 “a2 6 2 @ 26 -05 09 63 
10 qe b&b 2 2 47 -05 -09 29 
19 328 oS Ss 2 FW =a 
Σa²ij 7.19 2.76 2.02 1.64

* Decimal points omitted. 


of the estimated communality had been already extracted, it was decided
to discontinue the factoring at this point. The clustering of the
individuals in a relatively small “test space,” which was determined
predominantly by the first oblique factor, is seen in the
intercorrelations among the factors. Only the correlation between
factors B and D was negligible. The correlations of factors C and D with
A were of the order of .6, while correlations of the order of .4 were
found between factors A and B, B and C, and C and D. Subject No. 25, who
was the normative one obtained from the MMPI, shows interesting loadings
on the factors and determines to a large extent the identity of
Factor A.
Discussion 

The estimated communality among the para- 
noid schizophrenics in this sample of 63.3% 
of the total variability of all the individuals’ 
test behavior gives good support for the com- 
monness of people within this diagnostic cate- 
gory. It is seen from this statistic that they 
are more similar in behavior on these test 
items than they are different. The four group 
factors extracted accounted for 85% of this 
communality and 54.4% of the total variance 
of the individuals. This means that by refer- 
ence to these four personality types, more than 
50% of the variability of these individuals can 
be described. 
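
The percentages in this paragraph can be checked against the bottom row of Table 1. The sketch below assumes that row gives the sums of squared loadings for the four orthogonal factors, and that each of the 25 subjects contributes unit variance; the factor labels and function name are hypothetical.

```python
from typing import Dict, Tuple

# Bottom row of Table 1, read as sums of squared orthogonal loadings.
# The factor labels I-IV are assumed for illustration.
SUMS_OF_SQUARES: Dict[str, float] = {"I": 7.19, "II": 2.76, "III": 2.02, "IV": 1.64}
N_SUBJECTS = 25                 # 24 patients plus the normative subject
COMMUNALITY_SHARE = 0.633       # estimated communality as a share of total variance


def variance_shares(
    sums: Dict[str, float], n: int, communality_share: float
) -> Tuple[float, float, float]:
    """Return (share of total variance extracted by all factors,
    share of estimated communality extracted by all factors,
    share of estimated communality extracted by the first factor)."""
    total_variance = float(n)                       # unit variance per subject
    communality = communality_share * total_variance
    extracted = sum(sums.values())
    first = next(iter(sums.values()))
    return extracted / total_variance, extracted / communality, first / communality


of_total, of_communality, first_of_communality = variance_shares(
    SUMS_OF_SQUARES, N_SUBJECTS, COMMUNALITY_SHARE
)
```

Under these assumptions the four factors take out a little over 54% of the total variance and roughly 85 to 86% of the estimated communality, with the first factor alone near 45% of the communality, in line with the figures quoted in the text.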


Socially Normal Paranoid Type (A) 


This is the type of paranoid who makes a good social impression in his
behavior and makes a diagnosis of paranoid schizophrenia difficult.
Usually, diagnoses of such a paranoid psychosis depend upon obtaining a
valid history of delusional action in a single rather restricted area.
This factor or personality type might be described as a normalcy factor
inasmuch as individuals who are of this type correspond closely to the
answers given by the normative individuals of the MMPI (subject No. 25).
This factor seems to be largely a common species factor as described by
Cattell [2], i.e., they are similar in that they are human beings. The
only consistent deviant response they give is wondering if they locked
the door and closed the windows when they are away from home. Among
their normal responses, there is nothing particularly striking.
Individuals who approximated this type, almost without exception, gave
indications of restricted delusions on just a few of the 100 MMPI items.
These few markedly abnormal responses appearing for a subject did not
have sufficient weight to prevent him from being predominantly of this
type.

The nine most typical type A individuals gave the deviant answer:
worries about doors and windows. Nondeviant answers were: not happy
alone, not an important person, not being talked about, never feels like
smashing things, not obsessed with bad words, home life pleasant, no
strange or peculiar sensations, no one stealing ideas, not afraid of
knives, not being followed, most people would not lie to get ahead, not
inspired to life of duty, never kept diary, and never feels like going
to pieces.


Grandiose and Delusional Paranoid Type (B)


This type of paranoid is a very outgoing and overproductive person whose
delusional systems are largely in the grandiose area. People, to him,
are objects of his environment without any emotional importance;
overideational trends are also present for this type. These individuals
gave a mean number of true responses for the 100 items of 65.5 which, as
compared with the mean of the other three groups of only 41 responses,
seems to be clear evidence of overproductiveness and expansiveness.

The four most typical type B individuals gave the deviant answers as
follows: strong political opinions, agent of God, peculiar odors, being
talked about, strange and peculiar sensations, unusual religious
experiences, inspired to life of duty, never felt better, most people
would lie to get ahead, likes to know who to get next to on a job,
people make friends because they are useful, and wonders why people do
things for him. Nondeviant answers were: daily life full of interesting
things, talks to strangers, likes good serious lectures, leader, sex
life O.K., likes science, and judgment never better.


Evasive and Well-Integrated Paranoid Type (C)


This type of individual seems to be one who retains a good integration
and is able to conceal many of his delusions. Just what the type of
delusions might be probably would vary among individuals, but they all
seem to be rather successful in covering up their areas of difficulty.
One is impressed by the nature of their uniform denials on test items.
Related to this is the fact that they gave fewer true answers than any
of the other types. One gets the impression that this is the type of
individual who is rather amoral and not too cooperative or productive in
mental status inquiries. This type of individual would also make
diagnosis difficult although the history should be more productive than
for the type A who seems to have more restricted delusions. The
relationship between this type and type A is extremely high, probably
because they both retain a high degree of personality integration.

The five most typical type C individuals gave the deviant answer: would
not like expensive clothes. Nondeviant answers were: doesn’t want to
know who to get next to on a job, don’t feel like failure when hearing
of another’s success, never made to do things by hypnosis, not bothered
by sex thoughts, safe to trust people, can be friendly with those who do
wrong, never disgusted over sex, parents made him obey, never about to
go to pieces, likes science, seldom constipated, and nobody influencing
mind.
Sensitive, Inadequate, and Withdrawn Type (D)


This type of individual seems to experience extreme feelings of
inadequacy and many neurotic-type conflicts. He seems to be an immature
person who has tried unsuccessfully to please the parents and has now
come to evaluate himself as a “bad boy.” Inhibitions and suppression of
overt expression are paramount in this type of individual. We see this
in the area of sexual as well as other interpersonal behavior. This
individual does not seem to be very delusional and it is seen that the
correlation between the delusional type B and this one is quite low. It
is questionable whether such individuals should be classified as
paranoid inasmuch as they appear to manifest predominantly catatonic
symptoms.

The three most typical type D individuals gave 
deviant answers: others know thoughts, strangers 
look critically, dirt disgusts, compulsive counting, 
obsessed with bad words, been in trouble over sex, 
bothered by sex thoughts, criticism hurts terribly, 
dislikes dances, never talks to strangers, dislikes 
serious lectures, dislikes science, people make friends 
because they are useful, people don’t like to help 
others, and wonders why people help him. Non- 
deviant answers were: conduct controlled by cus- 
toms, doesn’t want to know who to get next to on a 
job, keeps mouth shut when in trouble, thoughts 
come too fast to speak, could never benefit world, 
have felt better than now, sleeps without disturbing 
thoughts, not afraid of germs, never had unusual 
religious experience, and never kept diary. There 
were too few individuals in the last classification to 
make a good evaluation of the nature of the factor 
type and yet one feels that they do not share the 
common features that the first three factor types do. 


The present study is regarded as being too restricted to yield results
which can be proposed as classificatory types or criteria, but should
serve as an illustration of the usefulness of the transposed
factor-analytic technique
in deriving such systems. It was felt that the 
selected MMPI items proved to be very useful 
samplings of personality variables even though 
they have a subjective basis. It is the investi- 
gators’ opinion that these items can be em- 
ployed successfully in other factor-analytic 
studies of pathological groups that might be 
capable of responding reliably and validly, such 
as cases of organic brain disease. It does seem 
clear from these results that at least two sub- 
types of paranoid psychoses should be distin- 
guished: the first, the socially normal well- 
integrated individual, and secondly, the grand- 
iose delusional person. It is conceivable that 
these two phenotypes might represent different 
phases of the same disease process. 


Summary and Conclusions 


1. Much criticism has been made of Kraepelinian classification, leading
to many constructive empirical investigations. The authors have felt
that such empirical studies should be encouraged and that methodological
experimentation will prove profitable in eventually establishing more
effective classificatory systems.

2. This paper is a research report on a transposed factor-analytic study
of 24 male paranoid schizophrenics and one statistically “normal.” One
hundred selected Minnesota Multiphasic items were employed to provide
samples of avowed attitudes and interests. Brief checks upon reliability
and validity were incorporated.

3. The factor analysis disclosed a large commonness among the
individuals in this study. Three personality types included under the
classification of paranoid schizophrenia are suggested. They are as
follows: A, socially normal paranoid; B, grandiose and delusional
paranoid; and C, evasive and well-integrated paranoid. A fourth factor
type disclosed is the sensitive, inadequate, and withdrawn type. This
latter may correspond to the catatonic subtype of schizophrenia.

4. In general there was a confirmation of the classification of
paranoid, but at least two types are different enough to merit separate
classificatory labels. They were the socially normal well-integrated
person and the grandiose delusional type.


Received February 9, 1953. 
It is occasionally of clinical interest to
know how a patient performs on a single con- 
stituent scale of the Minnesota Multiphasic 
Personality Inventory. Under such circum- 
stances the administration and scoring of the 
entire inventory seems an unwarrantedly long 
and cumbersome procedure. Furthermore, the 
conventional way of recording and scoring re- 
sponses tends to deflect attention away from 
the content of the individual responses and 
from the patterns of responses which contribute 
to the total scale scores. Material of manifest 
and potential clinical significance is thus too 
readily overlooked. Both limitations on the 
usefulness of the MMPI would be obviated if 
there were available paper-and-pencil forms of 
the separate constituent scales of the inventory. 

Preparation of forms for the separate scales 
involves the segregation of items pertaining to 
each of the scales. The subject is then required 
to respond to items which are in a sequence of 
related items and divorced from the total con- 
text of randomly distributed unrelated and “si- 
lent” items. What effect does this segregation 
and massing of items into separate constituent 
scales singly administered have on the validity 
of the resultant scores? The present study was 
designed to answer this question as it applies to 
the Psychopathic Deviate scale which has been 
reported [1] to have reasonably satisfactory 


validity when administered in the standardized 
card form. 


Method 


A mimeographed form of the Psychopathic Deviate scale was prepared with
the 50 items of the scale arranged in the same order as in McKinley and
Hathaway’s original listing [3]. By way of identification, each item was
lettered and numbered to accord with its designation in the manual [2].
To facilitate scoring, each item was assigned either an even or an odd
number which was imprinted in front of the identifying designation so
that a subject’s score on the scale might be found by simply counting
the even-numbered items marked “true” and the odd-numbered items marked
“false.” The mean and standard deviation for normal males were imprinted
at the bottom of the form for use in conversion of raw scores into T
scores. Instructions to the subject and the first two and last three
items of the scale follow as samples to illustrate the arrangement.


Below are 50 statements about yourself. Read each statement carefully.
If what it says is true for you, draw a circle around the T at the end
of the line. If what it says is false for you, draw a circle around the
F. Start at the top of the page. Mark all statements. Work fast.


1A4 I am neither gaining nor losing weight. T F
6B42 I have used alcohol excessively. T F
3126 I am easily downed in an argument. T F
4127 I find it hard to keep my mind on a task or job. T F
8130 My way of doing things is apt to be misunderstood by others. T F


Mean 14.00 Standard Deviation 4
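
The counting rule and the conversion to T scores can be sketched as follows. The function names are hypothetical, the conversion is the conventional T = 50 + 10z, and the standard deviation of 4 is used only as printed above; the printed figure may be incomplete, so it is passed in as a parameter rather than fixed.

```python
from typing import Iterable, Tuple


def pd_raw_score(marked: Iterable[Tuple[int, str]]) -> int:
    """Count scored responses on the paper-and-pencil form.

    Each entry is (scoring number printed before the item, answer circled).
    Even-numbered items count when marked "T"; odd-numbered items count
    when marked "F". Scoring numbers need not be unique.
    """
    score = 0
    for number, answer in marked:
        if answer not in ("T", "F"):
            raise ValueError(f"answer must be 'T' or 'F', got {answer!r}")
        keyed_true = number % 2 == 0
        if (keyed_true and answer == "T") or (not keyed_true and answer == "F"):
            score += 1
    return score


def t_score(raw: float, mean: float, sd: float) -> float:
    """Conventional T-score conversion: mean 50, standard deviation 10."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return 50.0 + 10.0 * (raw - mean) / sd


# The five sample items above, with an invented set of answers.
sample = [(1, "F"), (6, "T"), (3, "T"), (4, "F"), (8, "T")]
raw = pd_raw_score(sample)
```

With the printed norms, a raw score equal to the mean of 14 converts to a T score of 50, and each additional standard deviation adds 10 points.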


Subjects were 50 randomly selected patients 
admitted for observation to the psychiatric 
division of the Kings County Hospital. The 
median age of the subjects was 34 years, with 
a range of 16 to 59. They were classified psychiatrically as follows:


Chronic alcoholism .... 
Drug addiction . ; 
EE VEE OE 
Schizophrenia ...... “ 9 
Psychosis due to alcohol . 

Manic-depressive psychosis 
Psychoneurosis ........ ane 
Psychopathic personality 


ho 


= WO = 


sv S + 


Half the subjects were individually tested 
first with the paper-and-pencil massed form of 
the Pd scale and then, after one to three days, 
with the entire inventory in standard card 
form. The remaining half received the standard 
inventory first, followed, after a like interval, 
by the experimental form of the Pd scale. It 
was thus possible to check the validity of the 
experimental paper-and-pencil form of the Pd 
scale against the criterion of the Pd scale derived in standard fashion
from the administration of the MMPI as a whole.


Results 


As shown in Table 1, the mean T score of the 50 subjects on the
experimental form of the Pd scale is 64.24 as against a mean of 63.56 on
the standard form of the scale. The difference is well within chance
expectancy, p being equal to .62 for a two-tailed test. Rotation of the
order of administration to equalize for both forms any effect of
practice or familiarity seems to have been superfluous. The difference
between the means for tests administered first and for those
administered second is exactly zero. Nor does it seem to matter whether
the experimental form or the standard form is administered first.
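
The t values in Table 1 can be recovered from the reported means. The sketch below assumes that the column headed “SD Diff.” is the standard error of the difference between the two means, so that t is the mean difference divided by that figure; the labels and function name are hypothetical.

```python
from typing import Dict, Tuple

# (first mean, second mean, reported "SD Diff.") for each comparison in Table 1.
COMPARISONS: Dict[str, Tuple[float, float, float]] = {
    "standard vs experimental, N = 50": (63.56, 64.24, 1.33),
    "first test vs second test, N = 50": (64.0, 64.0, 1.33),
    "experimental then standard, N = 25": (67.24, 66.48, 2.18),
    "standard then experimental, N = 25": (60.64, 61.24, 1.57),
}


def t_ratio(mean_1: float, mean_2: float, se_diff: float) -> float:
    """Absolute mean difference divided by its standard error."""
    if se_diff <= 0:
        raise ValueError("standard error must be positive")
    return abs(mean_1 - mean_2) / se_diff


t_values = {
    label: round(t_ratio(m1, m2, se), 2)
    for label, (m1, m2, se) in COMPARISONS.items()
}
```

Under that reading the four ratios come out at .51, 0, .35, and .38, matching the tabled t values; none approaches the size needed for significance with samples of 25 or 50.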


The Pearson correlation coefficient between 
experimental-form scores and standard-form 
scores was found to be .79. In view of the fact 
that a test-retest reliability coefficient of .71 is 
reported by McKinley and Hathaway [3] for 
the standard form of the Pd scale, it may be 
said that the paper-and-pencil form of the Pd 
scale is as valid, against the criterion of the 
standard Pd scale, as the standard Pd scale is 
itself reliable. Greater validity, of this derived 
type, cannot be expected. 

Discussion 

It appears that no loss of accuracy results from segregating the items
of the Pd scale and administering them in massed sequence as a separate
paper-and-pencil scale out of the context of the MMPI as a whole. Such
cushioning effect as may normally be exerted by the “silent” items of
the MMPI and by items not relevant to the Pd scale seems to be without
consequence for the total score on the scale. It seems impressive that
the equivalence of the out-of-context Pd scale with the standard form of
the scale has been established with a quite deviate population. While
direct evidence is lacking, this finding at least suggests the
possibility that others of the constituent scales of the MMPI may also
fare well as separate scales.


Table 1


Comparison of T Score Means on the Standard and Experimental Forms
of the MMPI Pd Scale


                                Range of
Item                      N     T scores    Mean     SD Diff.   t      p

Standard Pd scale         50    37-98       63.56    1.33       .51    .62
Experimental Pd scale           35-93       64.24

First-test scores         50    37-93       64.0     1.33       0      1.00
Second-test scores              35-98       64.0

Experimental scale        25    42-93       67.24    2.18       .35    .71
Standard scale                  45-98       66.48

Standard scale            25    37-86       60.64    1.57       .38    .73
Experimental scale              35-88       61.24

A number of advantages accrue to the availability of the constituent
scales which may be administered in separate paper-and-pencil form. The
clinician’s attention is perforce directed to where it should be, viz.,
the subject’s actual responses. Identical total scores on a given scale
may stand for quite different patterns of actual behavior. Variations in
item clusters, although irrelevant to total score, may be quite
significant for personality evaluation and behavior prediction. Ready
availability of the concrete responses provides provocative material
which is useful in structuring follow-up clinical interviews. Finally,
availability of the separate scales is a contribution to efficiency in
those instances in which information on less than the total inventory is
sought.


Summary 


1. A paper-and-pencil form of the MMPI Pd scale, prepared as a separate
scale, was administered individually to a group of 50 psychiatric
patients. Scores on this experimental form of the Pd scale were compared
with scores obtained by the same subjects on the Pd scale when
administered in standard form as part of the total MMPI.

2. A correlation of .79 was obtained between scores on the experimental
form and scores on the standard form of the Pd scale. The out-of-context
paper-and-pencil Pd scale may be said to be as valid, against the
criterion of the standard form of the scale, as the standard form of the
scale is itself reliable.

3. The mean score for the experimental form was found to be not
significantly different from the mean score for the standard form. Order
of administration of the two forms made no difference.

4. Availability of a separate paper-and-pencil form permits more
efficient and flexible use of the MMPI, directs attention to clinically
significant content, presents data useful in follow-up interview, and
provides a permanent record of detailed concrete responses in readily
utilized form.


Received March 2 ]
In a recent article in this journal, Albert Ellis
seriously questions the value of the personality
inventory [2]. He raises a series of objections to
the use of these inventories which do
not seem fully justified to us. Although Ellis 
questions the value of all such inventories, he 
mentions only the MMPI by name. Since we 
cannot be certain which techniques other than 
the MMPI Ellis has reference to in his study, 
and since it is known that the various inven- 
tories show serious differences among them- 
selves in construction, reliability, and validity, 
we will confine our rebuttal to a defense of the
MMPI.


It is perhaps best that Ellis’ sweeping con- 
demnation of the personality inventories be 
answered point by point. His objections are 
centered around five arguments, the first of 
which is that “in most instances the inventories 
are not measuring the independent traits they 
are supposed to be measuring” [2, p. 48]. In 
support of this first point he cites 21 studies in 
which intercorrelations were calculated among 
the subscales of different inventories. Of these, 
20 studies showed significantly high correla- 
tions, indicating that the subscales were not 
measuring independent traits. Since Ellis does 
not give specific references, it was impossible 
to check which of the 21 studies applied spe- 
cifically to the MMPI. 

A check of the literature, however, shows 
only two studies in which intercorrelations of 
MMPI subscales were reported. Only one of 
these studies had been published before Ellis’ 
article appeared. Both studies, however, seem 
to support his contention that the subscales are 
correlated. But even a cursory inspection of 
the MMPI Manual [3] shows that no claim 
for freedom from intercorrelation has ever 
been made by the authors of the test. They point
out that the correlation of the Sc scale with the
Pt scale for normal cases is .84, but that it drops
to .75 on abnormal cases. They state specifically
that “The scale intercorrelations, which vary
widely, will be difficult to interpret until more
data are available on the dynamic
inter-relationships of the different clinical
syndromes” [3, p. 3].

1 Now at Michigan State College.

It seems necessary here to make the elemen- 
tary statistical observation that although two 
traits have a correlation of .70, this still means 
that more than 50 per cent of the variance is 
not taken into account by the correlation. 
Thus, while the subscales on the MMPI may 
have statistically significant intercorrelations, 
this does not mean that they do not have clin- 
ical usefulness either as separate subscales or 
as a group in profile analysis. 
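
The arithmetic behind this observation can be sketched briefly. The snippet below is a minimal illustration, not part of the original analysis: it squares a correlation coefficient to obtain the proportion of shared variance and subtracts it from one, using the .70 cited here together with the .84 and .75 quoted from the Manual. The function name `unshared_variance` is a hypothetical helper chosen only for this example.

```python
def unshared_variance(r: float) -> float:
    """Proportion of variance in one variable that is not accounted for
    by its linear correlation r with another variable (1 - r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return 1.0 - r * r


# Correlations mentioned in the text: the illustrative .70 and the two
# intercorrelations quoted from the MMPI Manual.
for r in (0.70, 0.84, 0.75):
    share = unshared_variance(r)
    print(f"r = {r:.2f}: {share:.1%} of the variance is not shared")
```

For r = .70 the squared correlation is .49, so slightly more than half of the variance remains unaccounted for, which is the point made above.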


The second point that Ellis raises concerns 
the lack of agreement among the various in- 
ventories and their failure to correlate highly 
with the Rorschach and other projective tech- 
niques. Both the fact that there are such strik- 
ing differences in the construction, reliability, 
and validity of the various personality inven- 
tories and the fact that the MMPI is an em- 
pirically developed key cast serious doubt upon 
the propriety of using other inventories as a 
check on its validity. Space does not permit a 
discussion of validity studies of the Rorschach 
and other projective techniques, but to say 
that their validity has not yet firmly been es- 
tablished experimentally seems to err on the 
side of charity. 

The third objection raised by Ellis is that 
personality inventories are easily faked. Al- 
though he admits that the MMPI can partial- 
ly compensate for this, we felt that a thorough 
examination of the studies which pertain to 
faking would be of value. Using the biblio- 
graphy listed in the MMPI Atlas [4] for a 
guide, we were able to find nine relevant 
studies, eight of which showed that faking
could be detected at a statistically significant 
level. 

Ellis’ next objection is perhaps the most 
serious. He states that personality inventories 
“usually do not give significant group discrim- 
inations when used with vocational, academic, 
socioeconomic, and disabled and ill groups. It 
was especially found that, in none of the areas
in which they are commonly employed, do personality
inventories consistently show significant group
discriminations” [2, p. 48]. He
supports this statement by an analysis of 499 
studies of personality inventories. Since he 
again fails to give references to the particular 
studies he used, we decided to examine all 


Table 1 
Significant Discriminations Found in Research Studies 
with the MMPI, 1940 through 1950 








Discriminations tested* 








Number of studies Number of studies 





Inventory scores vs. diagnostic 
examinations: 
Neuropsychiatric conditions 
Psychosomatic conditions 
Delinquent, criminal, and 
psychopathic behavior 


Alcoholism 
Predicting success in treatment 
Total 
Inventory scores vs. behavioral 
characteristics: 


Extent of social participation 

Tendency to become a sexual 
deviant 

Success at motor tasks 

Miscellaneous behavioral 
characteristics 


Total 


Inventory scores vs. vocational 
and academic test and 
performance results: 

Tendency to have specific 
vocational interests 

Membership in different voca- 
tional groups 

Success in actual vocational 


performance 
Success in academic achievement 
Total 
Inventory scores vs. group 
differences: 
Intellectual groups 
Sex groups 


Socioeconomic groups 

Disabled and ill groups 

Miscellaneous groups 
Total 


Crand Total - 








Total 
in which signifi- in which no signifi- number 
cant discrimina- cant discrimina- of 
tions were found _ tions were found studies 

14 4 18 
1 0 l 
5 1 6 
4 0 4 
9 0 9 

33 5 38 
4 0 o 
1 0 l 
1 0 1 
5 0 5 

11 0 11 
3 0 3 
2 2 4 
a 2 6 
3 0 3 

12 4 15 
3 0 3 
2 0 2 
3 0 3 
2 0 2 
5 0 S 

15 0 15 

71 9 80 





* This table is a duplicate of that presented by Ellis [2]. We have omitted “the se categories of his 
which we could find no relevant studies on the MMPI. 
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the published studies available to us which 
were relevant to the categories he lists in his 
table to see how the MMPI would hold up 
under this type of examination. Since the At- 
las includes most papers which make more than 
a passing reference to the MMPI from 1940 
to the end of 1950, and since the studies Ellis 
lists are from the period 1946 to 1951, we felt 
that most of the pertinent studies would be in 
this list. 


The data presented in Table 1 seem to show 
that Ellis is in serious error in his belief that 
the MMPI does not consistently show signifi- 
cant group discriminations and that, as Ellis 
puts it, different experimenters keep obtaining 
directly contradictory results when using the 
same or similar inventories. 
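
The tally in Table 1 can be reduced to simple proportions. The following is a rough sketch, not a computation reported by the authors: it takes the category subtotals from Table 1 and gives the share of studies in each category that found significant discriminations. The dictionary layout and the name `share_significant` are assumptions made for illustration.

```python
# Category subtotals from Table 1, as
# (studies with significant discriminations, studies without).
table_1 = {
    "diagnostic examinations": (33, 5),
    "behavioral characteristics": (11, 0),
    "vocational and academic results": (12, 4),
    "group differences": (15, 0),
}


def share_significant(significant: int, not_significant: int) -> float:
    """Fraction of studies in which significant discriminations were found."""
    total = significant + not_significant
    if total == 0:
        raise ValueError("no studies to summarize")
    return significant / total


for name, (sig, non) in table_1.items():
    print(f"{name}: {sig} of {sig + non} ({share_significant(sig, non):.0%})")

# Grand totals are obtained by summing the category subtotals.
total_sig = sum(sig for sig, _ in table_1.values())
total_non = sum(non for _, non in table_1.values())
print(f"all categories: {total_sig} of {total_sig + total_non} "
      f"({share_significant(total_sig, total_non):.1%})")
```

Summing the rows rather than copying the printed grand total keeps the calculation independent of any single figure in the table.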

Ellis’ final criticism is, we think, the most 
provocative. He feels that when personality in- 
ventories are most effectively used, they tend to 
become equally as time-consuming as alter- 
native psychodiagnostic procedures (such as 
interviewing and the use of projective tech- 
niques) which seem to be more clinically in- 
cisive and valuable. This complaint reminds 
us of Dallenbach’s oft-quoted remark, “Ease
and convenience are poor experimental guides” 
[1]. It scarcely seems necessary to add that 
they also make for poor clinical criteria. 


This belief that the clinical psychologist can 
gain more pertinent, incisive, and “depth- 
centered personality” material from more in- 
tuitive techniques than from the MMPI calls 
for careful examination. Again space does not 
permit a lengthy discussion of the validity of 
these intuitive techniques. We would, however, 
like to mention briefly a recent study by Holtzman
and Sells [5].


These authors have presented us with an 
excellent study of the efficacy of present-day 
clinical techniques. They administered a bat- 
tery of tests to a group of aviation cadets. The 
protocols were then presented to some of the 
most outstanding clinical psychologists in the 
country. Some of the cadets had adjusted to 
the stress of flight training while others were 
eliminated from the program because they
developed overt personality disturbances. The
clinicians were asked to separate one group 
from the other on the basis of the protocols. 

The results of this experiment were that 
no clinician or combination of clinicians using 
any single test or any combination of tests 
could predict above the chance level. But the
most disturbing finding of this study was the 
interclinician variability. To read that one 
group of clinicians in general described a cadet 
as having strong ego strength or a good hetero- 
sexual adjustment pattern, and then to read 
that other clinicians felt in general that the 
same cadet evidenced latent schizophrenic 
trends or latent homosexual tendencies causes 
us to lose a good deal of our faith in clinical 
intuition and in the tests employed in this 
study. 

The results of the Holtzman-Sells experi- 
ment, when compared to the findings listed in 
Table 1, lead us to believe that Ellis might im- 
prove his clinical intuition by laying aside his 
projective techniques for a furtive glance at an 
MMPI profile. 


Received April 6, 1953. 
Buros, Oscar Krisen. (Ed.) The fourth mental
measurements yearbook. Highland Park, N. J.
(220 Montgomery St.): Gryphon Press, 1953.
Pp. xxiv + 1163. $18.00.


This volume covers the period 1948 through 1951, 
and attempts to list all commercially available tests 
—educational, psychological, and vocational—pub- 
lished as separates in English-speaking countries 
during this time. The list includes the staggering 
total of 793 tests, 130 more than the Third Yearbook. 
Many more personality tests have been included, 
together with a number of tests which are available 
only as a part of restricted testing programs. There 
are 596 original test reviews by 308 reviewers, 53 
excerpts from journal test reviews, and 4,417 ref- 
erences on the construction, validity, use, and lim- 
itations of specific tests. In addition, 429 books on 
measurements and closely related fields are listed, 
and extensive excerpts from reviews of these books 
are reprinted. This volume, as have been the pre- 
ceding ones, is magnificently organized, indexed, and 
cross-indexed. The complete bibliographies for each 
test are invaluable. Some criticisms of previous vol- 
umes have been well heeded, as the listed instruc- 
tions to the test reviewers, and their performance, 
indicate. The level of the test reviews is generally 
high, and for the most part judicious. A few re- 
views are possibly excessively bitter. Most possibly 
controversial tests have two or more reviewers, and 
a comparison of the reviews serves to control the 
problem of possible personal bias of the reviewer; 
study of the reviews of several different tests by 
one reviewer may also be very illuminating here. 
Any prospective test user can certainly learn quick- 
ly, authoritatively, and painlessly what tests there 
are for any purpose, and what he needs to know 
about any test, so far as the knowledge is avail- 
able anywhere. Would-be test constructors are urged 
to read and reconsider—maybe their brain child 
should be aborted. But if it must be brought to 
term, what should and should not be done is amply 
exemplified here. Read and heed!—A. R.





Note.—The reviews were prepared by the Associ- 
ate Editors, who may be identified by their initials. 
Devereux, George. (Ed.) Psychoanalysis and the
occult. New York: International Universities
Press, 1953. Pp. xv + 432. $3.00.


This is an anthology of widely differing views
on “telepathic” and “clairvoyant” phenomena by
psychoanalysts and psychoanalytically oriented
writers. For the most part, the essays are based on
the authors’ observations of correspondences between
thoughts of patients and analyst, or between
patient’s (or analyst’s) thought and external events.
The positions range from heated disbelief with
explanations in terms of chance, suggestion,
projection, faint sensory clues, etc., to fiery
belief in the reality of thought transmission with
arguments for nerve induction, identification, etc.
Protagonist and antagonist each ascribe “deep
unconscious sources” to the other’s opposition.
Calls for experimental validation by skeptics are
often ignored by believers, or sometimes scoffed at,
as when Fodor cries out: “Since when has the
unconscious any respect for scientific evidence?”
(p. 331). Meanwhile, editor Devereux looks out over
this bloody battlefield with cool and often
brilliant detachment. His relative impartiality is
evident in the inclusion, in text and bibliography,
of writers who stress the two extremes as well as
several middle positions. But, additionally, he
writes two chapters of his own that are
characterized by neither haughty skepticism nor
uncontrolled acceptance. Thus, Devereux displays
some of the self-discipline and relative lack of bias
that Freud demanded of himself and followers, but
that neither Freud nor many followers, as this book
amply demonstrates, could quite muster.—E. L. K.


Fliess, Robert. The revival of interest in the dream.
New York: International Universities Press, 1953.
Pp. 164. $3.00.


Many years ago Freud complained that analysts
lacked interest in the dream and had little to add
to dream theory. In this small volume a close
student of Freud presents abstracts of recent
selected papers on the dream, classified according
to clinical observations, applied dream
interpretation, or additions to dream theory. He
includes as well his own contribution to the meaning
of the “spoken word” in the dream. In each instance,
Fliess appends his own critical note to the
published papers, frequently pointing out that the
points raised by the contemporary writer were
handled or at least recognized earlier by Freud.
Indeed, Fliess seems unduly constrained to
demonstrate that little has been contributed to
dream theory since Freud’s writings on the topic. If
he does not entirely succeed in this, at least he
shows conclusively that among analysts interest in
the dream has been “revived.” This is a useful
source book of recent dream studies and a
provocative pro-Freudian critique of dream
theory.—A. M. G.


Hammond, Kenneth R., & Allen, Jeremiah M., Jr. 
Writing clinical reports. New York: Prentice- 
Hall, 1953. Pp. xii + 235. $5.35. 


Practicing clinicians, as well as clinical teachers 
and students, often find report-writing difficult and 
reports dissatisfying and ambiguous. This man- 
ual, written jointly by a psychologist and a mem- 
ber of an English department, provides some real 
assistance. The philosophy of report-writing, the 
problems of communication with sophisticated and
unsophisticated readers, and the effective use of 
technical and nontechnical vocabulary are particu- 
larly well handled. The sections on style and or- 
ganization seem less directly helpful, although ex- 
amples are consistently drawn from the psycho- 
logical field. Many of the points made, and all 
of the extensive illustrative case material, are the 
outgrowth of suggestions provided by staffs of many 
different institutions polled by the authors. This 
book will find its way into the psychological li- 
brary of most clinics and hospitals; it deserves con- 
sideration also by university teachers of clinical psy- 
chology.—A. M. G. 


Kogan, Leonard S., Hunt, J. McVicker, & Bartelme, 
Phyllis F. A follow-up study of the results of
social casework. New York (192 Lexington Ave.) : 
Family Service Association of America, 1953. Pp. 
115. $2.50. 


This is the detailed report of an intensive follow- 
up study of the adjustment of 38 families five years 
after social casework treatment. All accessible mem- 
bers of these families were located and interviewed 
by a psychologist unacquainted with the nature of 
the families’ earlier problems. Practically all for- 
mer clients were found to place a high value on 
the help they had received. Judged improvement 
during the period of casework appears to have been 
sustained after closing of the cases but was not
related to judged improvement after closing.
Psychologists should find especially valuable the
section dealing with methodological and design
problems of follow-up studies.—E. L. K.


Lacey, Oliver L. Statistical methods in experimen- 
tation. New York: Macmillan, 1953. Pp. xi + 
249. $4.50. 


This is a brief introductory text designed to
introduce students to the use of statistics in
experimentation. It is clearly written and well
developed. Its use, however, will come at an
elementary, introductory level. It is not designed
for the sophisticated experimenter.—W. A. H.


Lindgren, Henry C. Psychology of personal and so- 
cial adjustment. New York: American Book Co., 
1953. Pp. ix + 481. $4.50. 


A book about normal people and their social re- 
lationships, which gives an excellent coverage of 
all phases of human adjustment and mental health. 
While based on psychological findings, the book is 
so well written that previous training in psychology 
is not needed to hold the student’s understanding 
and interest. Each chapter is followed by a con- 
cise summary of the important points, together with 
a list of source materials and suggested readings. 
Salient points are strengthened by excellent
illustrations and cartoons. There is a complete
author and subject index. Although it is planned
as a textbook for college students, the book should 
prove interesting reading for anyone wishing to 
gain a better understanding of everyday problems 
in human living.—B. M. L. 


Moreno, J. L. Who shall survive? (Rev. Ed.) 
Beacon, N. Y.: Beacon House, 1953. Pp. cxiv + 
763. $10.00. 


In his 114-page “preludes of the sociometric
movement,” Moreno presents a professional
autobiography and concludes it with this apparently
cogent insight: “There is no controversy about my
ideas; they are universally accepted. I am the
controversy.” There is no statement as to how this
large volume differs from the 1934 edition, to which
he refers here as a “bible of human relations”
(p. lxvi).
Comparison of the two indicates that this is an ex- 
pansion of the basic outline with slightly more dis- 
cussion of sociodrama, group psychotherapy, and 
psychodrama, although much less than one might 
expect from the subtitle and extensive bibliographies 
devoted to them. More attention is given in this 
book than in its predecessor to the broader implica- 
tions of sociometry and sociometric and spontaneity 
theory. The original book presented the theoretical 
bases for sociometry and the use of sociometric tech- 
niques with various groups. Despite the fact that 
the bibliography is greatly expanded (1,300 titles 
as compared to 41 in the first edition) there is no 
attempt to integrate systematically this wealth of 
material into the text proper. The book is an un- 
abashed exposition of Moreno’s ideas and is limited 
largely to a consideration of their influences. Cer- 
tainly Dr. Moreno has made a great and widely 
recognized contribution to the study and effective 
use of the group. However, the reviewer feels that 
these important ideas could have been presented 
with greater effect in fewer pages and would have 
been enriched by integrating them with the
outstanding work of others which has appeared since
the date of the first edition.—F. McK.
Naumberg, Margaret. Psychoneurotic art: its
function in psychotherapy. New York: Grune &
Stratton, 1953. Pp. x + 148. $6.75. 


Through the medium of an intensive case study, 
Miss Naumberg describes a psychoanalytically ori- 
ented method of therapy which relies heavily upon 
the symbolic character of the patient’s spontaneous 
art products. The presentation is perceptive and 
reflects the broad experience of a skilled clinician. 
Her own discussion is buttressed by an analysis by 
Adolph Woltmann of the Rorschach responses of a 
single case in relation to the art work produced 
during three years of therapy, and by Piotrow- 
ski’s comments on the Rorschach generally as it re- 
lates to the therapeutic use of graphic art. A crit- 
ical review of the literature on art in clinical prac- 
tice forms a useful appendix. It is obviously vital 
to the clinical culture for new techniques and meth- 
ods to circulate freely. This book admirably illus- 
trates the kind of procedural description that will 
allow other practitioners to adapt Miss Naumberg’s 
approach to their own therapeutic armamentaria. 
When it is concerned with theory, however, it is at 
times fuzzy, at times doctrinaire, failing to provide 
any adequate underpinning of ideas for the practi- 
cal techniques it advances. More concretely, read- 
ers may worry about the soundness of the interpre- 
tations derived from art products, just as they may 
wonder at the applicability of the method to more 
than a fraction of the neurotic population. Miss 
Naumberg’s book represents a genuinely creative 
effort. It deserves wide reading and the serious, 
critical consideration by which private hunches may
be transformed into useful public knowledge.—E. J. S.


Peters, R. S. (Ed.) Brett’s history of psychology.
New York: Macmillan, 1953. Pp. 742. $7.50.


In condensing G. S. Brett’s three volumes into 
one (Vol. I, 1912; Vols. II and III, 1921), Peters 
has retained the original organization and even the 
writing of Brett. Except for the last of 15 chapters, 
editing was done by omitting entire sections and 
adding introductory statements to each chapter. 
Even in its condensed form, this is a formidable 
volume. Except for the new chapter, it is more of 
a history of thought than of psychology. It traces 
in great detail the developments in philosophy, med- 
icine, and the physical sciences that led to psycho- 
logical formulations from earliest recorded history 
to the work of William James. In bringing the 
work up to date, Peters has discarded the original 
chapter on “modern” psychology, and substituted one 
on twentieth-century trends. This summary of the 
history of psychology for the past fifty years con- 
stitutes a thumbnail sketch of some of the more com- 
mon schools of contemporary psychology. This 
chapter does not have the sweep of Brett’s treatment
of earlier history, but the editor was frank to
recommend other works for fuller treatment of the
psychology of the past half century. The abridgement
of Brett’s History renders an important service to
psychology by making it more easily available to
students who, in their involvement with the technical
aspects of modern psychology, may not be aware of
the origins of current movements.—M. K.


Piaget, Jean. The origins of intelligence in
children. New York: International Universities
Press, 1953. Pp. xi + 419. $6.00.


This book, taken together with La Construction
du Reel Chez l’Enfant, and La Formation du Symbole
Chez l’Enfant, constitutes a series of works
devoted to the beginnings of intelligence. Piaget
considers with particular care the infantile
behavior which precedes, but forms the basis for,
“intelligent” behavior: those elementary
sensorimotor adaptations commonly classified as
reflexes and elementary habits. He is then able, by
means of the concepts of circular reaction and the
formation of secondary schemata, to account for the
development from simple sensorimotor adjustment to
the discovery and invention of new means of
adaptation. His emphasis on the importance of
perceptual behavior in this sequence, and on the
significance of learning (or “use”) in promoting
intellectual development, will strike a familiar
note for American psychologists. As usual, Piaget
shows in this work the careful, ingenious
observation of infants, and the detailed reports of
even their most fleeting reactions, which make his
developmental studies so stimulating. It should be
noted that the translation is excellent, rendering
an already compelling text into clear, appealing,
familiar terms.—A. M. G.


Senn, Milton J. E. (Ed.) Problems of infancy and
childhood. Transactions of the Sixth Conference.
New York: Josiah Macy, Jr. Foundation, 1953.
Pp. 160. $2.50.


This published account of the Sixth Macy Conference
on Infancy and Childhood brings together three
extended discussions of problems of infancy, along
with records of the group interchange following
each presentation. Escalona reports her
observations of emotional development and
mother-infant relationship in the first year of
life. K. M. Wolf presents some of the cases of
mothers and infants studied in connection with the
Yale rooming-in project. Stewart describes the
studies of excessive crying in infants which are
under way at the University of Washington Child
Health Center. All three papers are notable for
their contributions both to the phenomenology of
infant behavior and to the development of dynamic
hypotheses to account for it. Papers and discussion
alike retain their original informal, conference
form—a characteristic which adds markedly to their
appeal and readability.—A. M. G.
Slavson, S. R. Child psychotherapy. New York:
Columbia Univer. Press, 1952. Pp. xiii + 332.
$4.50. 


This book is much more than its title suggests. 
The consideration of therapy with children is pre- 
ceded by two important sections on normal child de- 
velopment in terms of needs, and on psychopathol- 
ogy in childhood, in terms of early primary rela- 
tionships and fixations. The dynamics of therapy 
are then presented as an outgrowth of these views 
of normal and pathological development. The au- 
thor acknowledges his indebtedness to Freud and 
maintains throughout a predominantly psychoana- 
lytic orientation. As in his other books, so here 
Slavson probably makes his most significant contri- 
bution in his careful breakdown of the therapeutic 
process, his attention to the role of parents and 
close relatives in affecting child behavior, and his 
emphasis on group factors in illness and treatment. 
The brief case illustrations which occur in the 
text often seem oversimplified by contrast to the 
theoretical exposition. The somewhat longer ther- 
apeutic record which concludes the book, on the 
other hand, adds considerably to the reader’s un- 
derstanding of the principles involved.—A. M. G. 


Townsend, John C. Introduction to experimental 
method. New York: McGraw-Hill, 1953. Pp. 
ix + 220. $4.00. 


This book is intended to appeal to three categor- 
ies of readers: “First, the undergraduate student 
who is undergoing his first exposure to the rigors 
of the experimental method in psychology and the 
social sciences. Second, the student who, with an 
inadequate background in the application of the 
experimental method, finds himself faced with the 
necessity of ‘doing a piece of research’ to satisfy 
thesis requirements. Third, the social science work- 
er who discovers that his job in industry, the clinic, 
the prison, etc., is one demanding the execution of 
research projects.” The writer's presentation is de- 
liberately superficial; therefore the book’s content 
may not be sufficient to the interests or the re- 
quirements of his second and third groups of read- 
ers. The material could prove useful, however, in 
undergraduate instruction, either as supplementary 
reading for the elementary psychology course or as 
part of the assigned material for the undergradu- 
ate laboratory. The content of the book is con- 
cerned with experimentation as traditionally em- 
ployed in psychology. Chapters 2 and 9, which 
have to do with causal sequences and methods of 
inference, are likely to contribute to clear thinking 
among undergraduates without the care and tedium 
which comparable material in a book on the logic 
of scientific method would require. Two other 
chapters, 7 and 8, which have to do with the con- 
trol of the experiment and procedure for experimen- 
tation, will be useful to the casual undergraduate 
student of psychology. Other chapters have to do 
with such matters as variables, controls, and
apparatus. The meaning of theory and the problem of
theory building, which are important in experiment- 
al psychology, are for the most part either ignored 
or evaded by reference to other works. This kind
of book could appeal to a wider audience if it were 
possible to prepare it in a manner more scholarly, 
thorough, and explicit. Apparently, however, the 
author considers his present superficial treatment 
best for his purposes.—J. R. W. 


Books Received 
Berdie, Ralph F., Layton, Wilbur L., & Hagenah,
Theda. Using tests in counseling. Minneapolis:
Student Counseling Bureau of Univer. of Minn.,
1953. Pp. v + 86. Paper, $1.00.

Berkman, Tessie D. Practice of social workers in
psychiatric hospitals and clinics. New York
(1860 Broadway): Amer. Assoc. of Psychiatric
Social Workers, 1953. Pp. ix + 158. Paper,
$2.00.

Bosselman, Beulah Chamberlain. The troubled
mind. New York: Ronald Press, 1953. Pp. v +
206. $3.50.

Burgess, Ernest W., & Wallin, Paul. Engagement
and marriage. Philadelphia: Lippincott, 1953.
Pp. xii + 819.

Fremont-Smith, Frank. (Ed.) Health and human
relations. Report of Josiah Macy Jr. Foundation
Conference, August 2-7, 1951. New York:
Blakiston, 1953. Pp. xx + 192. $4.00.

Gorlow, Leon, Hoch, Erasmus L., & Telschow, Earl
F. The nature of nondirective group psychotherapy.
New York: Teachers Coll., Columbia Univer.,
Bureau of Publications, 1952. Pp. xi + 143. $3.25.

Hoerr, Normand L., & Osol, Arthur. Illustrated
pocket medical dictionary. New York: Blakiston,
1952. Pp. xvi + 1005 + 24 figs. $3.75.

Jersild, Arthur T. In search of self. Teachers Coll., 
Columbia Univer., Bureau of Publications, 1952. 
Pp. xii + 145. $2.75. 

Jersild, Arthur T., Helfant, Kenneth, & associates. 
Education for self-understanding. New York: 
Teachers Coll., Columbia Univer., Bureau of 
Publications, 1953. Pp. ix + 54. Paper, 85c. 

Katz, Barney, & Lehner, George F. J. Mental
hygiene in modern living. New York: Ronald Press,
1953. Pp. x + 544. $4.50. 

Kubie, Susan H., & Landau, Gertrude. Group work 
with the aged. New York: International Univer- 
sities Press, 1953. Pp. 214. $3.50. 

Mettler, Fred A. (Ed.) Psychosurgical problems. 
New York: Blakiston, 1952. Pp. xii + 357. 
$7.00. 

Piaget, Jean. The child’s conception of number. 
New York: Humanities Press, 1952. Pp. ix + 
248. $5.00. 

Recktenwald, Lester Nicholas. Guidance and coun- 
seling. Washington, D. C.: Catholic Univer. of 
Amer. Press, 1953. Pp. xv + 192. 
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Aunouncement.. . 


Coming in January, 1954— 
the first ANNUAL of PASTORAL PSYCHOLOGY 


PASTORAL PSYCHOLOGY, a monthly periodical devoted to the practical
synthesis of the principles and techniques of clinical psychology, dynamic psychi-
atry, and psychiatric social work with spiritual and religious values, will publish
in January its first ANNUAL, devoted entirely to a listing of significant reference
and resource material for the minister, clinical psychologist, psychiatrist, and all
other workers in the field of human behavior.


A large section of the ANNUAL will be devoted to a special listing and descrip-
tion of significant books published within recent years on psychology, psychiatry, and
counseling, organized and graded by Professor Seward Hiltner and several mem-
bers of our Editorial Advisory Board, on the basis of the reading level and equip-
ment of the individual reader. It will also contain a listing of mental health films
and plays, and an article on readings in psychoanalysis with a listing of the out-
standing books in the field, with particular emphasis on the reading of Sigmund
Freud’s work.


In addition, the ANNUAL will contain a listing of psychiatric services, such as 
resources for clinical training, resources for psychiatric treatment of children and
adults, marriage counseling services, a listing of private and public treatment re- 
sources for children with behavior disorders, private psychiatric hospitals, resources 
for the treatment of alcoholics, etc. The ANNUAL will also contain a glossary 
of psychiatric technical words which appear frequently in the literature, as well 
as an Index of materials which appeared in PASTORAL PSYCHOLOGY 
during the past year. 


Individual issues will be on sale at $1.00. Special quantity prices will be as fol- 
lows: 

1 to 4 copies ...................... $1.00 per copy
5 to 24 copies ..................... $0.75 per copy
25 to [illegible] copies ........... $0.60 per copy
[illegible] ........................ $0.50 per copy

PASTORAL PSYCHOLOGY 


Great Neck, N. Y. 


Please enter our order for ......... copies of the ANNUAL of PASTORAL PSY-
CHOLOGY @ $...... per copy: [ ] Check enclosed. [ ] Bill when shipped.
Here for the first time is a detailed account
of an actual therapeutic case 


STEPS IN PSYCHOTHERAPY 


John Dollard, Frank Auld, Jr., & Alice M. White 


This book shows the student what ac-
tually happens in psychotherapy, us-
ing, as illustrative material, a detailed
account of the handling of a thera-
peutic case by a student-therapist
with the supervisor's running com-
mentary on the student’s methods. Fol-
lowing the presentation of the case,
there is a discussion of what the steps
were in this case; then a discussion of
sex-fear conflicts in marriage; and
finally, a discussion of the psychologi-
cal tests given to this patient and the
ways in which their results might be
used. Published in September.


222 pp.    $3.50


The Macmillan Company


60 FIFTH AVENUE, NEW YORK 11, N. Y.